% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/read_ascii.R
\name{read_ascii}
\alias{read_ascii}
\title{Read ASCII datasets downloaded from the Roper Center}
\usage{
read_ascii(
  file,
  total_cards = 1,
  var_names,
  var_cards = 1,
  var_positions,
  var_widths,
  card_pattern,
  respondent_pattern
)
}
\arguments{
\item{file}{A path to an ASCII data file.}

\item{total_cards}{For multicard files, the number of cards in the file.}

\item{var_names}{A string vector of variable names.}

\item{var_cards}{For multicard files, a numeric vector of the cards on which \code{var_names} are recorded.}

\item{var_positions}{A numeric vector of the column positions in which \code{var_names} are recorded.}

\item{var_widths}{A numeric vector of the widths used to record \code{var_names}.}

\item{card_pattern}{For files that do not contain exactly one line per card per respondent (because lines are missing, or because extra lines correspond to no respondent), a regular expression that matches the file's card identifier; e.g., if the card number is stored in the last digit of each line, \code{"\\\\d$"}.}

\item{respondent_pattern}{For files that do not contain exactly one line per card per respondent (because lines are missing, or because extra lines correspond to no respondent), a regular expression that matches the file's respondent identifier; e.g., if the respondent number is stored in the first four digits of each line, preceded by a space, \code{"(?<=^\\\\s)\\\\d{4}"}.}
}
\value{
A data frame containing any variables specified in the \code{var_names} argument, plus a numeric \code{respondent} identifier and as many string \code{card} variables (\code{card1}, \code{card2}, ...) as specified by the \code{total_cards} argument.
}
\description{
\code{read_ascii} extracts selected variables from ASCII data files downloaded from the Roper Center and returns them as a data frame.
}
\details{
Many older Roper Center datasets are available only in ASCII format, which is notoriously difficult to work with. The \code{read_ascii} function facilitates the process of extracting selected variables from ASCII datasets. For single-card files, one can simply identify the names, positions, and widths of the needed variables from the codebook and pass them to \code{read_ascii}'s \code{var_names}, \code{var_positions}, and \code{var_widths} arguments. Multicard datasets are more complicated. In the best case, the file contains one line per card per respondent; the user can then extract the needed variables by adding only the \code{var_cards} and \code{total_cards} arguments. When this condition is violated (there is not a line for every card for every respondent, or there are extra lines), the function will throw an error and ask the user to specify the additional arguments \code{card_pattern} and \code{respondent_pattern}.
}
\examples{
\dontrun{
# a single-card file
roper_download("USAIPO1982-1197G", # Gallup Poll for June 25-28, 1982
               download_dir = tempdir())  # remember to specify a directory for your download
                      
gallup1982 <- read_ascii(file = file.path(tempdir(), "USAIPO1982-1197G",
                                          "1197.dat"),
                         var_names = c("q09j", "weight"),
                         var_positions = c(38, 1),
                         var_widths = c(1, 1))
   
# a multi-card file, with extra lines that make the card_pattern and
# respondent_pattern arguments necessary
roper_download("USAIPOCNUS1996-9603008", # Gallup/CNN/USA Today Poll: Politics/1996 Election
               download_dir = tempdir())  # remember to specify a directory for your download

gallup1996 <- read_ascii(file = file.path(tempdir(), "USAIPOCNUS1996-9603008",
                                          "a9603008.dat"),
                         var_names = c("q43a", "q44", "weight"),
                         var_cards = c(6, 6, 1),
                         var_positions = c(62, 64, 13),
                         var_widths = c(1, 1, 3),
                         total_cards = 7,
                         card_pattern = "(?<=^.{10})\\\\d", 
                                        # (a digit, preceded by the start of the line
                                        # and ten other characters)
                         respondent_pattern = "(?<=^\\\\s{2})\\\\d{4}")
                                        # (four digits, preceded by the start of the line
                                        # and two whitespace characters)
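
# A base-R sketch of what patterns like the two above pick out of each line.
# The lines below are hypothetical (not taken from the Gallup file), and
# whether read_ascii() applies its patterns in exactly this way is an
# assumption. To keep the sketch free of backslashes, [0-9] and a literal
# space stand in for the digit and whitespace shorthands used above.
toy_lines <- c("  0001    1 4 2 075",   # respondent 0001, card 1
               "  0001    2 3 1",       # respondent 0001, card 2
               "  0002    1 5 1 120",   # respondent 0002, card 1
               "*** END OF BATCH ***")  # an extra line with no respondent

# extract_id() is a hypothetical helper, not part of the package: it returns
# the first match of pattern in each line, or NA where there is no match
extract_id <- function(lines, pattern) {
  m <- regexpr(pattern, lines, perl = TRUE)  # lookbehind needs perl = TRUE
  out <- rep(NA_character_, length(lines))
  out[m > 0] <- regmatches(lines, m)
  out
}

# a digit preceded by the start of the line and ten other characters
toy_cards <- extract_id(toy_lines, "(?<=^.{10})[0-9]")
# four digits preceded by the start of the line and two spaces
toy_respondents <- extract_id(toy_lines, "(?<=^ {2})[0-9]{4}")

# lines on which either identifier is NA correspond to no respondent
data.frame(respondent = toy_respondents, card = toy_cards)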
}
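
# A base-R sketch (not the implementation used by read_ascii()) of how the
# var_cards, var_positions, and var_widths arguments describe a well-formed
# multicard file. The toy file and its layout are hypothetical: two
# respondents with two cards each and one line per card per respondent, so
# every total_cards-th line belongs to the same card.
toy_file <- tempfile(fileext = ".dat")
writeLines(c("0001 1 3 075",   # respondent 1, card 1
             "0001 2 5 2  ",   # respondent 1, card 2
             "0002 1 1 120",   # respondent 2, card 1
             "0002 2 4 1  "),  # respondent 2, card 2
           toy_file)

total_cards   <- 2
var_names     <- c("q1", "weight", "q2")
var_cards     <- c(1, 1, 2)    # the card on which each variable is recorded
var_positions <- c(8, 10, 8)   # the first column of each variable
var_widths    <- c(1, 3, 1)    # the number of columns each variable occupies

toy_lines <- readLines(toy_file)
toy_data <- data.frame(respondent = seq_len(length(toy_lines) / total_cards))
for (i in seq_along(var_names)) {
  # the lines holding card var_cards[i], one per respondent
  card_lines <- toy_lines[seq(var_cards[i], length(toy_lines), by = total_cards)]
  # cut out the columns from var_positions[i] through the end of the width
  toy_data[[var_names[i]]] <- as.numeric(
    substr(card_lines, var_positions[i], var_positions[i] + var_widths[i] - 1))
}
toy_data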

}
