% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cutData.R
\name{cutData}
\alias{cutData}
\title{Function to split data in different ways for conditioning}
\usage{
cutData(
  x,
  type = "default",
  names = NULL,
  suffix = NULL,
  hemisphere = "northern",
  n.levels = 4,
  start.day = 1,
  is.axis = FALSE,
  local.tz = NULL,
  latitude = 51,
  longitude = -0.5,
  ...
)
}
\arguments{
\item{x}{A data frame containing a field \code{date}.}

\item{type}{A string giving the way in which the data frame should be split.
Pre-defined values are: \code{"default"}, \code{"year"}, \code{"hour"}, \code{"month"},
\code{"season"}, \code{"seasonyear"} (or \code{"yearseason"}), \code{"weekday"}, \code{"site"},
\code{"weekend"}, \code{"monthyear"}, \code{"daylight"}, \code{"dst"} (daylight saving time)
and \code{"wd"}.

\code{type} can also be the name of a numeric or factor column. If a numeric
column name is supplied, \code{\link[=cutData]{cutData()}} will split the data into quantiles
(four by default; see \code{n.levels}). Factor levels will be used to split the
data without any adjustment.}

\item{names}{By default, the columns created by \code{\link[=cutData]{cutData()}} are named after
their \code{type} option. Specifying \code{names} defines other names for the columns,
which map onto the \code{type} options in the same order they are given. The
length of \code{names} should therefore be equal to the length of \code{type}.}

\item{suffix}{If \code{names} is not specified, \code{suffix} will be appended to any
added columns that would otherwise overwrite existing columns. For example,
\code{cutData(mydata, "nox", suffix = "_cuts")} would append a \code{nox_cuts} column
rather than overwriting \code{nox}.}

\item{hemisphere}{Can be \code{"northern"} or \code{"southern"}, used to split data
into seasons.}

\item{n.levels}{Number of quantiles to split numeric data into.}

\item{start.day}{The day of the week on which \code{type = "weekday"} should start.
The user can change the start day by supplying an integer between 0 and 6.
Sunday = 0, Monday = 1, ... For example, to start the weekday plots on a
Saturday, choose \code{start.day = 6}.}

\item{is.axis}{A logical (\code{TRUE}/\code{FALSE}), used to request shortened cut
labels for axes.}

\item{local.tz}{Used for identifying whether a date has daylight saving time
(DST) applied or not. Examples include \code{local.tz = "Europe/London"},
\code{local.tz = "America/New_York"}, i.e., time zones that assume DST.
\url{https://en.wikipedia.org/wiki/List_of_zoneinfo_time_zones} shows time
zones that should be valid for most systems. It is important that the
original data are in GMT (UTC) or a fixed offset from GMT.}

\item{latitude, longitude}{The decimal latitude and longitude used when \code{type = "daylight"}. Note that locations west of Greenwich have negative
longitudes.}

\item{...}{All additional parameters are passed on to next function(s).}
}
\value{
Returns the data frame, \code{x}, with columns appended as defined by
\code{type} and \code{names}.
}
\description{
Utility function to split data frames up in various ways for conditioning
plots. Used by many \code{openair} functions, usually through the option
\code{type}.
}
\details{
This section gives a brief description of each of the defined levels of \code{type}.
Note that all time-dependent types require a column \code{date}.
\itemize{
\item \code{"default"} does not split the data but will describe the levels as a date
range in the format "day month year".
\item \code{"year"} splits the data by each year.
\item \code{"month"} splits the data by month of the year.
\item \code{"hour"} splits the data by hour of the day.
\item \code{"monthyear"} splits the data by year and month. It differs from
\code{"month"} in that a level is defined for each month of the data set. This
is useful for showing an ordered sequence of months when the data set starts
partway through a year rather than in January.
\item \code{"weekend"} splits the data by weekday and weekend.
\item \code{"weekday"} splits the data by day of the week, ordered to start on Monday by default (see \code{start.day}).
\item \code{"season"} splits data up by season. In the northern hemisphere winter =
December, January, February; spring = March, April, May etc. These
definitions will change if \code{hemisphere = "southern"}.
\item \code{"seasonyear"} (or \code{"yearseason"}) will split the data into year-season
intervals, keeping the months of a season together. For example, December
2010 is considered as part of winter 2011 (with January and February 2011).
This makes it easier to consider contiguous seasons. In contrast, \code{type = "season"} will just split the data into four seasons regardless of the year.
\item \code{"daylight"} splits the data relative to estimated sunrise and sunset to
give either daylight or nighttime. The cut is made by \code{cutDaylight} but more
conveniently accessed via \code{cutData}, e.g. \code{cutData(mydata, type = "daylight", latitude = my.latitude, longitude = my.longitude)}. The daylight estimation,
which is valid for dates between 1901 and 2099, is made using the measurement
location, date, time and astronomical algorithms to estimate the relative
positions of the Sun and the measurement location on the Earth's surface, and
is based on NOAA methods. Measurement location should be set using \code{latitude}
(+ to North; - to South) and \code{longitude} (+ to East; - to West).
\item \code{"dst"} will split the data by hours that are in daylight saving time (DST)
and hours that are not for appropriate time zones. The option also requires
that the local time zone is given, e.g., \code{local.tz = "Europe/London"},
\code{local.tz = "America/New_York"}. Each of the two periods will be in
\emph{local time}. The main purpose of this option is to test whether there
is a shift in the diurnal profile when DST and non-DST hours are compared.
This option is particularly useful with the \code{\link[=timeVariation]{timeVariation()}} function. For
example, close to the source of road vehicle emissions, "rush-hour" will tend
to occur at the same \emph{local time} throughout the year, e.g., 8 am and 5 pm.
Therefore, comparing non-DST hours with DST hours will tend to show similar
diurnal patterns (at least in the timing of the peaks, if not magnitude) when
expressed in local time. By contrast, a variable such as wind speed or
temperature should show a clear shift when expressed in local time. In
essence, this option when used with \code{timeVariation()} may help determine
whether the variation in a pollutant is driven by man-made emissions or
natural processes.
\item \code{"wd"} splits the data into 8 wind sectors ("NE", "E", "SE", "S", "SW",
"W", "NW", "N") and requires a column \code{wd}.
}

Note that all the date-based types, e.g., \code{"month"}/\code{"year"}, are derived from
a column \code{date}. If a user already has a column named after one of the
date-based types, it will not be used.
}
\examples{
## split data by day of the week
mydata <- cutData(mydata, type = "weekday")
names(mydata)
head(mydata)
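
\dontrun{
## Illustrative sketches only, assuming the openair example data set
## 'mydata' with columns such as 'date' and 'nox'. The object names on
## the left-hand side are arbitrary and chosen here for illustration.

## split a numeric column into quantiles; 'suffix' keeps the original
## 'nox' column and adds 'nox_cuts' alongside it
nox_cut <- cutData(mydata, type = "nox", suffix = "_cuts")
head(nox_cut)

## as above, but with three quantile bands instead of the default four
nox_cut3 <- cutData(mydata, type = "nox", n.levels = 3, suffix = "_cuts")

## several types at once, with 'names' giving the new column names in
## the same order as 'type' ('yr' and 'mon' are arbitrary names)
by_time <- cutData(mydata, type = c("year", "month"), names = c("yr", "mon"))

## daylight / nighttime split for an assumed measurement location
by_light <- cutData(mydata, type = "daylight", latitude = 51, longitude = -0.5)

## DST / non-DST split; the data are assumed to be in GMT (UTC)
by_dst <- cutData(mydata, type = "dst", local.tz = "Europe/London")
}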
}
\author{
David Carslaw

Karl Ropkins (\code{"daylight"} option)
}
