Eszter's Stata Goodies Page
Stata is my statistical software package of choice. I've put together
this page with some helpful resources for those who are just starting out
with it or those who've been using it for a while but have not bothered
to explore it much. Of course, there are tons of such resources out there
already. This page is mainly for me to reference helpful sites I've
already found once and to point my colleagues to them easily.
On this page:
Getting Started | How-to
Pages | Using DO Files | Text
Editor | Helpful Hints | ADO
Files | More
GETTING STARTED
Although I mainly use Stata for Windows, all of the following are also
relevant to Stata under UNIX (which I sometimes still use).
HOW-TO PAGES
Before you start - info on what you should know about Stata before you even
start, e.g. details about variable names (although note that this is a bit
outdated and does not cover version 7, which allows variable names of up to
32 characters), variable width, variable type, etc.
Some more basics are available on Princeton's Stata
Tutorial.
UCLA has some great resources for learning how to use Stata in more depth
(they also have a helpful search utility for their site).
USING DO-FILES
Do-files allow you to 1. keep track of everything you've done
to/with your data so your work is replicable; 2. run tons of
commands quickly. You can also think of them as a safety mechanism
that lets you easily go back to your original data set no matter what
transformations you may have performed on it (or what variables you may
have mistakenly deleted - yikes!).
Here's an example do-file template: stataexample.do
Alternatively, the same info but with
explanations of what each line means: stataexample
(Although I had to give it a .txt extension for it to display without
problems online, if you save it, make sure to save it with a .do
extension.)
Note that I have a little information section at the top of
that do-file. That is so you can keep track of what project this do-file
is for, where it is located and what it does.
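For readers who cannot open the linked template, here is a minimal sketch of what such a do-file might look like. It is not the stataexample.do file itself; the file names (myproject.log, mydata_original, mydata_working) are made up for illustration.

```stata
* ------------------------------------------------------------
* Project:  (name of the project this do-file belongs to)
* File:     (where this do-file is saved)
* Purpose:  (what this do-file does)
* ------------------------------------------------------------

version 7                          /* interpret commands as Stata 7 would */
set more off                       /* do not pause after each screen of output */
capture log close                  /* close any log left open, ignore errors */
log using myproject.log, replace   /* hypothetical log file name */

use mydata_original, clear         /* hypothetical original data set */

* ... data transformations and analysis commands go here ...

save mydata_working, replace       /* save under a NEW name so the original stays intact */
log close
```

Saving the result under a different name is what makes the do-file a safety mechanism: rerunning it always starts again from the untouched original data.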
TEXT EDITOR
You can open your do-files in any text editor. If you are using Stata in
Windows then you can simply go to the Window menu and select Do-file
editor (or press CTRL-8). However, this editor is quite limited in
capacity, as are the editors that automatically come with Windows:
Notepad and Wordpad.
Instead of
these options, I recommend using UltraEdit (shareware $35.00) because it comes with some
nice additional features that will make editing do-files much more
convenient. You can download a Stata7
configuration for it, which will automatically highlight certain words for
you to distinguish commands from comment sections and the general body of
your file. To use the Stata word list, go to Advanced->Configuration in
UltraEdit and under Syntax Highlighting select the Stata7.txt file.
You may want to add some additional words to the Stata7.txt
wordlist for highlighting. You can do this by editing the Stata7.txt file.
Here are some of the things I have found most helpful in
UltraEdit:
Use Find function across files (search all .do files in a directory
for a word or phrase)
Use Replace function across files (replace a word or phrase across
several files at once)
Open numerous documents at the same time without cluttering the
Windows bottom panel: UltraEdit has its own panel. If you're used to it
being on the bottom, you can move it there.
SOME HELPFUL HINTS
If you want to see how much time it took to run your last command,
along with the time of day it finished, set return messages as follows:
set rmsg on
The default memory setting may be too low for the size of your data
set. You can change it with the following:
set mem XXm
where XX is the amount of memory you prefer, in megabytes (e.g. set mem 20m).
Initially you can use no more than 40 variables in a model. You can
change this with matsize (not in Small Stata though):
set matsize # (the maximum is 800 in Intercooled Stata)
You can easily create dummy variables:
tabulate varname, generate(dummyname)
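As a small illustration of the dummy-variable hint, assuming a hypothetical categorical variable region with four categories and a hypothetical outcome income:

```stata
* one dummy is created per category of region (region is a made-up name)
tabulate region, generate(regdum)

* with four categories this creates regdum1, regdum2, regdum3 and regdum4
describe regdum*

* leave one dummy out of the model as the reference category
regress income regdum2 regdum3 regdum4
```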
!!! Make tables that you can import straight into other documents
with a nice layout and all pertinent information:
mktab - this saved my life! Thank you Conrad!
Here's an example of a command to make this run (more info is
available in the
ado help file that comes with mktab).
mktab (depvar iv1 iv2 iv3) (depvar iv1 iv2
iv3 iv4), cmd(reg) aux(_cons=Intercept) est(N, r2=R2, r2_a=Adjusted R2)
flag(.1=***,1=**,5=*,10=#) notags efmt(%4.3f) xlabel ylabel log
(logfile.log, replace) continue screen
which will give you (and save to logfile.log) a table in which the first
column has the names of your independent variables, the second column has
the coefficients and standard errors of the first model with the three
independent variables (iv1, iv2, iv3), with significance marked by
asterisks, and the third column has the coefficients and standard errors
of the second model with four variables (iv1, iv2, iv3, iv4), again with
asterisks marking the specified significance levels. You also get the
intercept plus info on N, the R2 value, and adjusted R2, with the
significance levels noted at the bottom of the table.
Random commands I've found helpful to know about
typing , obs after pwcorr lets you know how many
observations were used for each pairwise correlation (corr is
listwise)
typing , sig star(#) after pwcorr marks with a star the coefficients that
are significant at the # level (replace the # with the level you want,
e.g. (10) or (.1) for ten percent)
format allows you to specify the display format (e.g. to show
no more than two decimal places): format var %9.2f
list allows you to list some
variables for all cases (or for whatever cases you specify); if you sort
on one of them before running the list command then they will be listed
according to that variable
To get several graphs on one image, create the graphs and
then:
graph using g1 g2 g3 where g1, g2, g3 are your various graphs.
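A short sketch of the pwcorr, format, and list hints from this section used together. The variable names (id, income, educ, age) are hypothetical placeholders for whatever is in your data set.

```stata
* pairwise correlations showing the number of observations and the
* significance level for each pair, with a star at the 5% level
pwcorr income educ age, obs sig star(.05)

* display income with two decimal places
format income %9.2f

* list three variables for the first ten cases, ordered by age
sort age
list id age income in 1/10
```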
Dealing with files (merging, conversion)
How to
merge multiple files
How to use a Stata
graph in Microsoft documents (such as Word, PowerPoint)
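As a rough sketch of a basic two-file merge in the older syntax of the Stata versions discussed on this page (the linked how-to covers the details): first.dta, second.dta and the key variable id are assumed names. Newer Stata releases use a different form, merge 1:1 id using second, and do not require sorting first.

```stata
* sort the second file on the key variable and save it
use second, clear
sort id
save second, replace

* load and sort the first file on the same key
use first, clear
sort id

* match-merge on id; Stata adds a variable called _merge
merge id using second

* _merge: 1 = only in first, 2 = only in second, 3 = in both
tabulate _merge
```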
HELPFUL ado FILES
To install these ado files from a network-connected machine, from within
Stata use:
net from http://url-of-data-source
net install name-of-ado-file
If you're looking for an ado file but don't know its location, just use
net search nameoffile
or you can try:
findit nameoffile
Data manipulation
cutv6 - to quickly recode continuous variables into groups
Data analysis
dlogit2 - to compute marginal effects for logistic regression, probit
regression, and multinomial logistic regression
(see detailed documentation here)
bys
- "by" and "sort" in one command
Creating graphs
fbar - for some additional bar graph options (nicer than hist)
Creating tables (as per the above)
mktab
MORE...
Not enough? Check out these additional resources:
My friend Diane's Stata
Tricks page
STATA
Mailing List archives
Still don't know what to do? Do a search on Google and make your search phrase as
specific as possible (e.g. stata merge data two
files if you're looking for info on how to merge two data files in
Stata)
Feel free to leave your mark: please sign my Guestbook.
Back to Eszter Hargittai's
Homepage
Last updated: June, 2003
http://www.princeton.edu/~eszter/stata.html