Stata programs (snippets)

These are a few programs that I have written for Stata. They come without instructions, as I believe most of them are fairly self-explanatory to the average Stata user. To use them, copy and paste the code into your .do-file, or save the code in an .ado-file and put it in your ado folder. If you don’t know where that folder is located, type “sysdir” in your Stata console.
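As a small sketch of how to locate the ado folder: the commands below only list directories and change nothing. That PERSONAL is the usual place for your own .ado-files is a common convention and an assumption here, not something the programs require.

```stata
* List Stata's system directories (look for the PERSONAL entry)
sysdir

* Show the full search path Stata uses when looking for .ado-files
adopath

* Print only the personal ado directory
display "`c(sysdir_personal)'"
```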

FASTMAX : Equivalent to the max() function of egen, but speeds up the process by a factor of 20.

/*           FASTMAX PROGRAM
Creator:     Jonas Cederlöf
Date:        March 2017
Description: The program draws on the "fegen" package by Sergio Correia.
             The purpose of the program is to speed up the max function
             in the much slower egen command.
*/
cap program drop fastmax
program define fastmax, rclass
    syntax varname [if] [in] , [by(varlist)] name(string)

    * Work on a temporary copy so the original variable is left untouched
    tempvar maxvar
    clonevar `maxvar' = `varlist'

    if "`by'" != "" {
        * Running maximum within each by-group; the last value is the group maximum
        bys `by' : replace `maxvar' = max(`maxvar'[_n-1], `maxvar')
        bys `by' : gen     `name'   = `maxvar'[_N]
    }
    else {
        * Running maximum over the whole dataset
        replace `maxvar' = max(`maxvar'[_n-1], `maxvar')
        gen     `name'   = `maxvar'[_N]
    }
end
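The running-maximum idea behind the program can be checked against egen directly. The following is a minimal sketch that does not call the program itself; it assumes the auto dataset that ships with Stata, and the variable names max_egen, runmax and max_fast are made up for the illustration.

```stata
* Example data shipped with Stata
sysuse auto, clear

* Reference result: group maximum computed by egen
bysort foreign: egen double max_egen = max(price)

* Running maximum within group: each row takes the larger of its own value
* and the (already updated) value of the previous row in the group
clonevar runmax = price
bysort foreign: replace runmax = max(runmax[_n-1], runmax)

* The last row of each group now holds the group maximum; copy it to all rows
bysort foreign: gen max_fast = runmax[_N]

* Both approaches should agree on every observation
assert max_egen == max_fast
```

With the program defined, the corresponding call would presumably be: fastmax price, by(foreign) name(maxprice)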

GSAMPLE : Sampling by group (a command that is completely lacking in Stata)

* Created by     : Jonas Cederlöf
* Date           : February 2017
* Contact        :
* Description    : Random sampling by group variable. Sampled groups are kept
*                  whole (all their observations), and keep() sets the share
*                  to retain as a fraction, e.g. keep(0.1) for about 10%.

cap program drop gsample
program define gsample , rclass
    syntax varlist [if] [in] , keep(numlist max=1 >0 <=1)

    * Number of observations before sampling
    qui count
    local xN = r(N)

    tempvar x_rand
    tempvar x_maxrand

    * One uniform draw per group, stored on the first observation of the group
    * (using _n == 1 also works with several or string group variables)
    qui bys `varlist' : gen double `x_rand' = runiform() if _n == 1
    * Spread the draw to every observation in the group
    qui bys `varlist' : egen double `x_maxrand' = max(`x_rand')

    * Keep whole groups whose draw falls below the requested share
    local temp = `keep'*100
    qui keep if `x_maxrand' < `keep'

    qui count
    local xnewN = r(N)
    display "You have sampled `temp'% of the population by the variable(s) `varlist'."
    display "Number of remaining observations: `xnewN' out of the original `xN'."
end
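
The group-sampling logic can also be tried step by step without the program. This is a rough sketch under a few assumptions: it uses the auto dataset shipped with Stata, rep78 as the grouping variable, a share of 0.5, and the illustrative variable names n_before and u. The seed is arbitrary and only makes the draw reproducible.

```stata
* Example data shipped with Stata
sysuse auto, clear
set seed 12345

* Record each group's size so we can verify that groups are kept whole
bysort rep78: gen n_before = _N

* One uniform draw per group, placed on the group's first observation
bysort rep78: gen double u = runiform() if _n == 1

* Copy the group's draw to all of its observations
bysort rep78: replace u = u[1]

* Keep entire groups whose draw is below the target share
keep if u < 0.5

* Every surviving group still has all of its original observations
bysort rep78: assert _N == n_before
```

Because the draw is made per group, the share of observations kept only approximates the target when groups differ in size. With the program defined, the corresponding call would presumably be: gsample rep78, keep(0.5)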