Convert a dta file to csv without Stata software

2022-12-24 18:17 问答作者：

Is there a way to convert a dta file to a csv?

I do not have a version of Stata installed on my comp开发者_运维百科uter, so I cannot do something like:

File --> "Save as csv"

The frankly-incredible data-analysis library for Python called Pandas has a function to read Stata files.

After installing Pandas you can just do:

>>> import pandas as pd
>>> data = pd.io.stata.read_stata('my_stata_file.dta')
>>> data.to_csv('my_stata_file.csv')

Amazing!

You could try doing it through R:

For Stata <= 15 you can use the haven package to read the dataset and then you simply write it to external CSV file:

library(haven)
yourData = read_dta("path/to/file")
write.csv(yourData, file = "yourStataFile.csv")

Alternatively, visit the link pointed by huntaub in a comment below.

For Stata <= 12 datasets foreign package can also be used

library(foreign)
yourData <- read.dta("yourStataFile.dta")

You can do it in StatTransfer, R or perl (as mentioned by others), but StatTransfer costs $$$ and R/Perl have a learning curve.
There is a free, menu-driven stats program from AM Statistical Software that can open and convert Stata .dta from all versions of Stata, see:

http://am.air.org/

I have not tried, but if you know Perl you can use the Parse-Stata-DtaReader module to convert the file for you.

The module has a command-line tool dta2csv, which can "convert Stata 8 and Stata 10 .dta files to csv"

Another way of converting between pretty much any data format using R is with the rio package.

Install R from CRAN and open R
Install the rio package using install.packages("rio")

Load the rio library, then use the convert() function:

library("rio")
convert("my_file.dta", "my_file.csv")

This method allows you to convert between many formats (e.g., Stata, SPSS, SAS, CSV, etc.). It uses the file extension to infer format and load using the appropriate importing package. More info can be found on the R-project rio page.

The R method will work reliably, and it requires little knowledge of R. Note that the conversion using the foreign package will preserve data, but may introduce differences. For example, when converting a table without a primary key, the primary key and associated columns will be inserted during the conversion.

From http://www.r-bloggers.com/using-r-for-stata-to-csv-conversion/ I recommend:

library(foreign)
write.table(read.dta(file.choose()), file=file.choose(), quote = FALSE, sep = ",")

In Python, one can use statsmodels.iolib.foreign.genfromdta to read Stata datasets. In addition, there is also a wrapper of the aforementioned function which can be used to read a Stata file directly from the web: statsmodels.datasets.webuse.

Nevertheless, both of the above rely on the use of the pandas.io.stata.StataReader.data, which is now a legacy function and has been deprecated. As such, the new pandas.read_stata function should now always be used instead.

According to the source file of stata.py, as of version 0.23.0, the following are supported:

Stata data file versions:

Valid encodings:

ascii
us-ascii
latin-1
latin_1
iso-8859-1
iso8859-1
8859
cp819
latin
latin1
L1

As others have noted, the pandas.to_csv function can then be used to save the file into disk. A related function numpy.savetxt can also save the data as a text file.

EDIT:

The following details come from help dtaversion in Stata 15.1:

        Stata version     .dta file format
        ----------------------------------------
               1               102
            2, 3               103
               4               104
               5               105
               6               108
               7            110 and 111
            8, 9            112 and 113
          10, 11               114
              12               115
              13               117
              14 and 15        118 (# of variables <= 32,767)
              15               119 (# of variables > 32,767, Stata/MP only)
        ----------------------------------------
        file formats 103, 106, 107, 109, and 116
        were never used in any official release.

StatTransfer is a program that moves data easily between Stata, Excel (or csv), SAS, etc. It is very user friendly (requires no programming skills). See www.stattransfer.com

If you use the program just note that you will have to choose "ASCII/Text - Delimited" to work with .csv files rather than .xls

Some mentioned SPSS, StatTransfer, they are not free. R and Python (also mentioned above) may be your choice. But personally, I would like to recommend Python, the syntax is much more intuitive than R. You can just use several command lines with Pandas in Python to read and export most of the commonly used data formats:

import pandas as pd

df = pd.read_stata('YourDataName.dta')

df.to_csv('YourDataName.csv')

SPSS can also read .dta files and export them to .csv, but that costs money. PSPP, an open source version of SPSS, which is rough, might also be able to read/export .dta files.

PYTHON - CONVERT STATA FILES IN DIRECTORY TO CSV

import glob
import pandas

path=r"{Path to Folder}"

for my_dir in glob.glob("*.dta")[0:1]:
    file = path+my_dir  # collects all the stata files
    # get the file path/name without the ".dta" extension
    file_name, file_extension = os.path.splitext(file)

    # read your data
    df = pandas.read_stata(file, convert_categoricals=False, convert_missing=True)

    # save the data and never think about stata again :)
    df.to_csv(file_name + '.csv')

For those who have Stata (even though the asker does not) you can use this:

outsheet produces a tab-delimited file so you need to specify the comma option like below

outsheet [varlist] using file.csv , comma

also, if you want to remove labels (which are included by default

outsheet [varlist] using file.csv, comma nolabel

hat tip to:

http://www.ats.ucla.edu/stat/stata/faq/outsheet.htm

继续阅读：csv file-conversion stata

Convert a dta file to csv without Stata software

Stata data file versions:

Valid encodings:

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

Stata data file versions:

Valid encodings:

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？