Downloading WRDS Data

This page covers functions for exploring the WRDS database and downloading data from Compustat, CRSP, and Fama-French. All functions take a database connection as the first argument (see Establish DB Connection).

These functions query WRDS tables whose names are stored in WRDSMerger.default_tables. If your database uses different table names, update the dictionary before calling these functions.

Explore WRDS

These functions help you discover what data is available in WRDS.

WRDSMerger.describe_tableFunction
describe_table(conn, library::String, table::String)

Get a table's description (row count, columns, column types)

source
WRDSMerger.get_tableFunction
get_table(
                conn,
                library::String,
                table::String;
                obs::Union{Nothing, Int} = nothing,
                offset::Int = 0,
                cols = nothing
            )

Create a DataFrame from a table

source
WRDSMerger.raw_sqlFunction
raw_sql(
    conn,
    query::String
)

Executes raw sql code, and converts code to a DataFrame

source

Compustat

Download fundamental financial data from Compustat. By default downloads annual data; set annual=false for quarterly. Note that column names differ between annual and quarterly datasets (e.g., sale vs saleq).

WRDSMerger.comp_dataFunction
function comp_data(
    conn[,
    gvkeys::AbstractArray{String},]
    dateStart::Union{Date,Int}=1950,
    dateEnd::Union{Date,Int}=Dates.today();
    annual::Bool=true,
    filters::Dict{String,<:Any}=Dict(
        "datafmt" => "STD",
        "indfmt" => "INDL",
        "consol" => "C",
        "popsrc" => "D"
    ),
    cols::Array{String}=["gvkey", "datadate", "fyear", "sale", "revt", "xopr"]
)

Downloads data from Compustat for firms (if list of gvkeys is provided, filters to those firms) available firms over a period. Data can be annual (set annual=true) or quarterly (set annual=false). For quarterly data, you also likely need to change the columns that are downloaded (ie, sales is "saleq" in quarterly data).

Filters is a dictionary of String => String (or array of String) pairings that will be applied to the SQL query.

source

CRSP

Functions for downloading CRSP stock data, market indices, delisting returns, and making common adjustments (split-adjusting prices, incorporating delisting returns, etc.).

To use monthly data instead of daily, update the default tables:

WRDSMerger.default_tables["crsp_stock_data"] = "crsp.msf"
WRDSMerger.default_tables["crsp_index"] = "crsp.msi"
WRDSMerger.default_tables["crsp_delist"] = "crsp.msedelist"
WRDSMerger.crsp_stocknamesFunction
function crsp_stocknames(
    conn;
    cols::Array{String}=["permno", "cusip", "ncusip", "comnam", "namedt", "nameenddt", "ticker"]
)

function crsp_stocknames(
    conn,
    cusip::Array{String};
    cols::Array{String}=["permno", "cusip", "ncusip", "comnam", "namedt", "nameenddt", "ticker"],
    cusip_col="cusip", # either "cusip" "ncusip" or "ticker"
)

function crsp_stocknames(
    conn,
    permno::Array{<:Number};
    cols::Array{String}=["permno", "cusip", "ncusip", "comnam", "namedt", "nameenddt", "ticker"],
)

Download crsp.stockname data (with non-missing ncusip). If an array of strings is passed, by default assumes it is a list of cusips, can be a list of ncusip or tickers (change cusip_col). Can also take an array of numbers which is assumed to be a list of permnos.

source
WRDSMerger.crsp_marketFunction
function crsp_market(
    conn,
    dateStart::Union{Date,Int}=1950,
    dateEnd::Union{Date,Int}=Dates.today();
    col::Union{String, Array{String}}="vwretd"
)

Downloads the data from the daily or monthly stock index file (dsi and msi) for a range of dates with one value for each day (with various return columns). Available columns are:

  • "vwretd": Value weighted return with dividends
  • "vwretx": Value weighted return without dividends
  • "ewretd": Equal weighted return with dividends
  • "ewretx": Equal weighted return without dividends
  • "sprtrn": S&P500 return
source
WRDSMerger.crsp_dataFunction
function crsp_data(
    conn,
    [permnos::Vector{<:Real},]
    s::Date=Date(1925),
    e::Date=today();
    cols = ["ret", "vol", "shrout"],
    filters::Dict{String, <:Any}=Dict{String, Any}(),
    adjust_crsp_data::Bool=true
)

function crsp_data(
    conn,
    permnos::Vector{<:Real},
    dates::Vector{Date},
    [dates_end::Vector{Date}];
    cols=["ret", "vol", "shrout"],
    adjust_crsp_data::Bool=true,
    query_size_limit::Int=3000
)

Downloads data from the crsp stockfiles, which are individual stocks. To download the data from the monthly stockfile, change the default table to the monthly stockfile:

WRDSMerger.default_tables["crsp_stock_data"] = "crsp.msf"
WRDSMerger.default_tables["crsp_index"] = "crsp.msi"
WRDSMerger.default_tables["crsp_delist"] = "crsp.msedelist"

Arguments

  • permnos::Vecotr{<:Real}: A vector of firm IDs, if provided, will only download data for those firms
  • s::Date=Date(1925) and e::Date=today(): Downloads all data between two dates provided
  • dates::Vector{Date}: Downloads data for a set of permnos on the date provided
    • dates_end::Vector{Date}: If provided, then treats the dates as the start of a period and will download data for the permnos between the two dates
  • adjust_crsp_data::Bool=true: This will call crsp_adjust with all options set to true, it will only do the operations that it has the data for.
source
WRDSMerger.crsp_delistFunction
function crsp_delist(
    conn;
    cols::Array{String}=[
        "permno",
        "dlstdt",
        "dlret"
    ],
    date_start::Date=Date(1926),
    date_end::Date=today()
)

Fetches the CRSP delist dataset, typically for the returns on the day of delisting.

source
WRDSMerger.crsp_adjustFunction
function crsp_adjust(
    conn,
    df::DataFrame;
    kwargs...
)

This makes 4 common adjustments to CRSP data:

  1. When the price in CRSP is negative, that means there was not a specific close price. Typically, researchers take the absolute value to get the actual price.
  2. Prices are not adjusted for splits (returns are). CRSP includes the number (cfascpr) that will adjust prices to be comparable through time
  3. Similar to prices, shares outstanding are not adjusted. The number used to adjust is different due to various events.
  4. Prices are not adjusted for delisting, so this downloads the necessary dataset and adjusts returns accordingly.

Arguments

Options on what to adjust

  • adjustprcnegatives::Bool=true: Corresponds to (1) above
  • adjustprcsplits::Bool=true: Corresponds to (2) above
  • adjustshrsplits::Bool=true: Corresponds to (3) above
  • adjust_delist::Bool=true: Corresponds to (4) above

Pre-existing column names

  • date::String="date": date column
  • idcol::String="permno": primary identifier (permno) column
  • prc_col::String="prc": price column
  • ret_col::String="ret": return column
  • prcsplitscol::String="cfacpr": price split adjustment factor
  • shrout_col::String="shrout": shares outstanding
  • shroutsplitscol::String="cfacshr": share split adjustment factor

Options to relabel columns

  • adjustednegprc_col::String="prc": relabel prices to this after taking absolute value
  • adjustedprccol::String="prc_adj": relabel prices to this after adjusting for splits
  • adjustedshroutcol::String="shrout_adj":relabel shares outstanding to this after adjusting for splits
source

Fama-French Factors

Download daily Fama-French factor data, including the market, size, value, and momentum factors.

WRDSMerger.ff_dataFunction
ff_data(conn, date_start=Date(1926,7,1), date_end=today(); cols=["date","mktrf","smb","hml","rf","umd"])

Download Fama-French factor data from the ff.factors_daily table (configurable via default_tables["ff_factors"]). Returns a DataFrame with the requested columns over the given date range.

Available columns include mktrf, smb, hml, rf, and umd (momentum).

source