% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dimensional_model_define_fact.R
\name{define_fact}
\alias{define_fact}
\alias{define_fact.dimensional_model}
\title{Define facts in a \code{dimensional_model} object}
\usage{
define_fact(
  st,
  name = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = "nrow_agg"
)

\method{define_fact}{dimensional_model}(
  st,
  name = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = "nrow_agg"
)
}
\arguments{
\item{st}{A \code{dimensional_model} object.}

\item{name}{A string, name of the fact.}

\item{measures}{A vector of measurement names.}

\item{agg_functions}{A vector of aggregation function names, one for each
measurement. Allowed values are SUM, MAX and MIN; if none is indicated, the
default is SUM.}

\item{nrow_agg}{A string, name of the measurement that holds the number of
rows aggregated.}
}
\value{
A \code{dimensional_model} object.
}
\description{
To define facts in a \code{dimensional_model} object, a name is required,
together with a set of measurements, which can be empty (a fact with no
explicit measurements). Each measurement has an associated aggregation
function, which by default is SUM.
}
\details{
To get a star schema (a \code{star_schema} object) we need a flat table
(implemented through a \code{tibble}) and a \code{dimensional_model} object. Facts
are defined in the \code{dimensional_model} object using the column names of the
flat table. The \code{dput} function can print those column names so that they
can be copied instead of typed by hand.

Each measurement has an associated aggregation function, which can be SUM,
MAX or MIN. The mean is not among the possible aggregation functions because
the mean of the means of subsets of the data is not necessarily the mean of
the whole data.

An additional measurement with the number of aggregated rows is always
added. Dividing a SUM measurement by this row count yields the mean if
needed.
}
\examples{
library(tidyr)

# dput(colnames(mrs_age))
#
# c(
#   "Reception Year",
#   "Reception Week",
#   "Reception Date",
#   "Data Availability Year",
#   "Data Availability Week",
#   "Data Availability Date",
#   "Year",
#   "WEEK",
#   "Week Ending Date",
#   "REGION",
#   "State",
#   "City",
#   "Age Range",
#   "Deaths"
# )

dm <- dimensional_model() \%>\%
  define_fact(
    name = "mrs_age",
    measures = c("Deaths"),
    agg_functions = c("SUM"),
    nrow_agg = "nrow_agg"
  )

dm <- dimensional_model() \%>\%
  define_fact(
    name = "mrs_age",
    measures = c("Deaths")
  )

dm <- dimensional_model() \%>\%
  define_fact(name = "Factless fact")
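
# A minimal base-R sketch (it does not use the package API) of why the mean
# is not offered as an aggregation function, and how SUM together with the
# row-count measurement recovers it. The data frame "flat" and its columns
# are hypothetical, made up for illustration only.
flat <- data.frame(
  city = c("A", "A", "A", "B"),
  deaths = c(10, 20, 30, 100),
  nrow_agg = 1
)

# Aggregate by city: SUM of deaths and number of rows aggregated.
agg <- aggregate(cbind(deaths, nrow_agg) ~ city, data = flat, FUN = sum)

# Averaging the per-city means does not give the mean of the whole data ...
mean_of_means <- mean(agg$deaths / agg$nrow_agg)

# ... but the total SUM divided by the total row count does.
overall_mean <- sum(agg$deaths) / sum(agg$nrow_agg)

stopifnot(isTRUE(all.equal(overall_mean, mean(flat$deaths))))
stopifnot(!isTRUE(all.equal(mean_of_means, overall_mean)))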

}
\seealso{


Other star definition functions: 
\code{\link{define_dimension}()},
\code{\link{dimensional_model}()}
}
\concept{star definition functions}
