Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Regression by industry and year excluding firm i
Date: Fri, 13 Dec 2013 20:18:56 +0000

This is very good advice.
Nick
njcoxstata@gmail.com

On 13 December 2013 19:41, Sarah Edgington <sedging@ucla.edu> wrote:
> Ahmed,
> As an aside, this strikes me as one of those instances where you would
> benefit a great deal from debugging your code on a subset of your data. You
> need enough data for your regressions to run without errors but I'd try
> getting the loop working on a subset of a few hundred observations rather
> than the whole data set. That will run much more quickly. The resulting
> predictions will be nonsense but they'll serve as a proof of concept. Once
> you're happy that you have code that does what you expect you can run it on
> the whole dataset with a certain amount of confidence that even if it takes
> a very long time, you'll get the results that reflect your intended process.
> -Sarah
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Fernando Rios
> Avila
> Sent: Friday, December 13, 2013 11:21 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: Regression by industry and year excluding firm i
>
> Ahmed,
> In addition to Nick Cox's comments, keep in mind that based on your
> explanation, you need to run 95000 regressions, which will be very
> time-consuming. But computer time is "cheap".
> I would suggest, however, clarifying whether each observation represents a
> different firm, which is the assumption under which your code and Nick's
> handle the problem.
> Fernando
> HTH
>
> On Fri, Dec 13, 2013 at 2:12 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Sorry, no.
>>
>> The code hasn't finished running, so
>>
>> 1. Good news. No obvious bug.
>>
>> 2. I'd expect that code to be slow. You want a regression for every
>> observation.
>>
>> I don't think you've demonstrated anything wrong with my code, so I
>> can't possibly fix it. That doesn't mean the code must be right, but
>> you need to show me incorrect results first.
>> The point is that your
>> code would, I imagine, have been even slower had it been correct.
>> Several of the changes I made would have speeded up things compared
>> with your code.
>>
>> I don't have your data to test anything, but without wanting to seem
>> arrogant, I think you need to be confident that I made a mistake
>> before you change my code.
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 13 December 2013 19:01, Abdalla, Ahmed <ahmed.abdalla@kcl.ac.uk> wrote:
>>> Dear Nick
>>> Many thanks for that.
>>> I understand your code now. I ran it. However, STATA has been running the
>>> loop for more than 40 minutes now and I got no output !!!
>>> I will explain more:
>>> I have a model:
>>> wce = b0 + b1*wlag_ce + b2*wato + b3*wlag_acc + b4*wacc + b5*wdsale + b6*wndsale
>>>
>>> I want to run this model using all observations in a particular
>>> industry-year excluding firm i. Expected wce for firm i is measured using
>>> the coefficients I obtain from the industry-year regressions multiplied by
>>> the actual values of the variables in the model for firm i.
>>> As far as I understand, your code should achieve my target, but it took a
>>> long time and didn't give any results !
>>> I even tried another code that worked well and gave me results in
>>> seconds, but it doesn't exclude firm i from the estimation. I will write
>>> this code for you here:
>>> egen sic2id=group(sic_2 datadate)
>>> egen count=count(sic2id), by(sic2id)
>>> drop if count<10
>>> drop count
>>> drop sic2id
>>> egen sic2id=group(sic_2 datadate)
>>>
>>> gen b0=.
>>> gen b1= .
>>> gen b2=.
>>> gen b3=.
>>> gen b4=.
>>> gen b5=.
>>> gen b6=.
>>>
>>> sum sic2id
>>> scalar max2=r(max)
>>> local k=max2
>>> set more off
>>> forvalues x=1(1)`k'{
>>> capture reg wce wlag_ce wato wlag_acc wacc wdsale wndsale if sic2id==`x'
>>> capture replace b0= _b[_cons]
>>> capture replace b1= _b[wlag_ce]
>>> capture replace b2= _b[wato]
>>> capture replace b3= _b[wlag_acc]
>>> capture replace b4= _b[wacc]
>>> capture replace b5= _b[wdsale]
>>> capture replace b6= _b[wndsale]
>>> }
>>>
>>> I appreciate if you can explain what was wrong with your code and update
>>> the new code I have posted here to exclude firm i.
>>>
>>> ________________________________________
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> <owner-statalist@hsphsun2.harvard.edu> on behalf of Nick Cox
>>> <njcoxstata@gmail.com>
>>> Sent: 13 December 2013 18:03
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: Regression by industry and year excluding firm i
>>>
>>> Remarks
>>>
>>> 1. If you are cycling over observations, you don't need a variable
>>> containing observation numbers, nor to use -levelsof-.
>>>
>>> 2. -in- is always faster than the corresponding -if-.
>>>
>>> 3. wlag_ce=!=. is presumably a typo, but to Stata it will be illegal
>>> syntax.
>>>
>>> 4. -capture replace b0= _b[_cons]- will end with the last intercept
>>> calculated. I guess you don't want that.
>>>
>>> 5. Checking for missing values is redundant as -regress- will never
>>> include them.
>>>
>>> With these and some other small tricks, here is an attempt at
>>> rewriting your code.
>>>
>>> local X wlag_ce wato wlag_acc wacc wdsale wndsale
>>> tokenize "`X'"
>>>
>>> forval j = 0/6 {
>>> gen b`j'=.
>>> }
>>>
>>> forval i = 1/`=_N' {
>>> local same sic_2[`i'] == sic_2 & datadate[`i'] == datadate
>>> qui count if `same' & _n != `i'
>>>
>>> if r(N) > 10 {
>>> reg wce `X' if `same' & _n != `i'
>>> }
>>>
>>> quietly if _rc == 0 {
>>> replace b0 = _b[_cons] in `i'
>>> forval j = 1/6 {
>>> replace b`j' = _b[``j''] in `i'
>>> }
>>> }
>>> }
>>>
>>> gen pred_ce= b0 + b1*wlag_ce + b2*wato + b3*wlag_acc + ///
>>> b4*wacc + b5*wdsale + b6*wndsale
>>>
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>>
>>> On 13 December 2013 17:33, Abdalla, Ahmed <ahmed.abdalla@kcl.ac.uk> wrote:
>>>> Dear Statalist
>>>> I run a regression to estimate core earnings for each variable in my
>>>> dataset. The regression is run using all observations in a particular
>>>> industry year EXCLUDING firm i. Expected core earnings for firm i is
>>>> estimated using the coefficients multiplied by the actual values of
>>>> variables in the model for firm i.
>>>> I run the following code.
>>>>
>>>> First: I get an error message for macro length being exceeded.
>>>> Second: I try to use other commands for looping, the loop runs but it
>>>> gives me error message for invalid syntax.
>>>> My problem is on how to exclude firm i ? I hope if you have any
>>>> suggestions regarding running regressions by industry and year and
>>>> excluding firm i from the estimation procedures.
>>>>
>>>>
>>>> gen obs= [_n]
>>>> gen runn=1
>>>>
>>>> gen b0=.
>>>> gen b1= .
>>>> gen b2=.
>>>> gen b3=.
>>>> gen b4=.
>>>> gen b5=.
>>>> gen b6=.
>>>>
>>>> levelsof obs,local(levels)
>>>> foreach x of local levels{
>>>> gen mark=1 if obs==runn
>>>> gen sic_lp= sic_2 if obs ==runn
>>>> qui summ sic_lp
>>>> replace sic_lp = r(mean) if sic_lp==.
>>>> gen datadate_lp= datadate if obs == runn
>>>> qui summ datadate_lp
>>>> replace datadate_lp = r(mean) if datadate_lp==.
>>>> format datadate_lp %d
>>>> gen sample =1 if sic_lp== sic_2 & datadate_lp== datadate & sale !=. & wce !=. & wlag_ce=!=. & wato !=. & wacc !=. & wlag_acc!=. & wdsale !=. & wndsale !=.
>>>> egen sample_sum= sum(sample) if mark != 1
>>>> capture reg wce wlag_ce wato wlag_acc wacc wdsale wndsale if sample==1 & mark != 1 & sample_sum >10
>>>> capture replace b0= _b[_cons]
>>>> capture replace b1= _b[wlag_ce] if obs==runn
>>>> capture replace b2= _b[wato] if obs==runn
>>>> capture replace b3= _b[wlag_acc] if obs==runn
>>>> capture replace b4= _b[wacc] if obs==runn
>>>> capture replace b5= _b[wdsale] if obs==runn
>>>> capture replace b6= _b[wndsale] if obs==runn
>>>> drop mark sic_lp datadate_lp sample sample_sum
>>>> replace runn= runn+1
>>>> }
>>>>
>>>> gen pred_ce= b0+ b1*wlag_ce + b2*wato +b3*wlag_acc + b4*wacc + b5*wdsale + b6*wndsale
>>>>
>>>>
>>>> I appreciate your help
>>>>
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/
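
Editorial sketch, not part of the thread: the loop discussed above fits one regression per observation, which is why it is slow. For ordinary least squares there is a standard identity that may avoid this: the prediction for observation i from a fit that excludes i equals y_i - e_i/(1 - h_i), where e_i is the residual and h_i the leverage from the fit on the whole group. Under that identity, one regression per industry-year is enough. The code below assumes the variable names used in the thread (wce, wlag_ce, wato, wlag_acc, wacc, wdsale, wndsale, sic_2, datadate) and, as Fernando notes, that each observation in an industry-year is a different firm. The names grp and pred_loo are hypothetical, and the size rule (at least 12 usable observations, so that more than 10 remain after dropping one) only approximates the r(N) > 10 check in Nick's code, which counts observations before missing values are excluded.

```stata
* Leave-one-out predictions from ONE regression per industry-year group.
* grp and pred_loo are hypothetical names introduced for this sketch.
egen long grp = group(sic_2 datadate)
gen double pred_loo = .

quietly summarize grp, meanonly
local G = r(max)

forvalues g = 1/`G' {
    * capture sets _rc, so a failed regression is skipped rather than reused
    capture quietly regress wce wlag_ce wato wlag_acc wacc wdsale wndsale if grp == `g'
    if _rc == 0 & e(N) > 11 {
        tempvar e h
        quietly predict double `e' if e(sample), residuals
        quietly predict double `h' if e(sample), hat
        * prediction for i from the fit excluding i: y_i - e_i/(1 - h_i)
        quietly replace pred_loo = wce - `e' / (1 - `h') if e(sample)
        drop `e' `h'
    }
}
```

In line with Sarah's advice, it would be sensible to compare pred_loo with the pred_ce produced by the observation-by-observation loop on a small subset before relying on either.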

**References**:

- **st: Regression by industry and year excluding firm i**
  *From:* "Abdalla, Ahmed" <ahmed.abdalla@kcl.ac.uk>
- **Re: st: Regression by industry and year excluding firm i**
  *From:* Nick Cox <njcoxstata@gmail.com>
- **RE: st: Regression by industry and year excluding firm i**
  *From:* "Abdalla, Ahmed" <ahmed.abdalla@kcl.ac.uk>
- **Re: st: Regression by industry and year excluding firm i**
  *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: Regression by industry and year excluding firm i**
  *From:* Fernando Rios Avila <f.rios.a@gmail.com>
- **RE: st: Regression by industry and year excluding firm i**
  *From:* "Sarah Edgington" <sedging@ucla.edu>
