% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/plot_read_quality.R
\name{plot_read_quality}
\alias{plot_read_quality}
\title{Plot read length vs. read quality}
\usage{
plot_read_quality(
  fastq_input,
  use_ee_rate = FALSE,
  plot_title = TRUE,
  alpha = 0.5
)
}
\arguments{
\item{fastq_input}{(Required). A FASTQ file path or FASTQ object containing
reads. See \emph{Details}.}

\item{use_ee_rate}{(Optional). If \code{TRUE}, the plot will display the
expected error (EE) rate on the y-axis instead of the mean quality score.
Defaults to \code{FALSE}.}

\item{plot_title}{(Optional). If \code{TRUE} (default), a title will be
displayed in the plot. The title will either be "Read length vs Expected
error rate (EE) of read" or "Read length vs Average quality score of read",
depending on \code{use_ee_rate}. Set to \code{FALSE} for no title.}

\item{alpha}{(Optional). The transparency level of the points in the scatter
plot. Defaults to \code{0.5}.}
}
\value{
A ggplot2 object displaying the scatter plot with marginal histograms.
}
\description{
Generates a scatter plot visualizing the relationship between read length and
read quality. The y-axis can display either the mean quality score per read
or the expected error (EE) rate. Marginal histograms are included to show the
distribution of read lengths and quality metrics.
}
\details{
This function plots read length against a per-read quality metric. The metric
is either the mean quality score per read (default) or the expected error (EE)
rate, as selected by \code{use_ee_rate}.

\code{fastq_input} can be either a file path to a FASTQ file or a FASTQ
object. FASTQ objects are tibbles that contain the columns \code{Header},
\code{Sequence}, and \code{Quality}; see \code{\link[microseq]{readFastq}}.

The EE rate is calculated as the mean of the per-base error probabilities in
a read. The error probability of each base is computed as
\eqn{10^{(-Q/10)}}, where \eqn{Q} is the Phred quality score of that base. A
lower EE rate indicates higher sequence quality, while a higher EE rate
suggests lower confidence in the read.

Marginal histograms are added to display the distribution of read lengths
(top) and quality scores or EE rates (right).

If \code{fastq_input} contains more than 10 000 reads, the function will
randomly select 10 000 rows for downstream calculations. This subsampling is
performed to reduce computation time and improve performance on large
datasets.
}
\examples{
# Define arguments
fastq_input <- system.file("extdata/small_R1.fq", package = "Rsearch")

# Generate and display scatter plot with mean quality score on y-axis
p1 <- plot_read_quality(fastq_input = fastq_input)
print(p1)

# Generate and display scatter plot with expected error (EE) rate on y-axis
p2 <- plot_read_quality(fastq_input = fastq_input,
                        use_ee_rate = TRUE)
print(p2)
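
# A minimal base-R sketch of the EE rate calculation described in Details.
# Assumptions, not taken from the package source: quality strings are
# Phred+33 encoded, and ee_rate_sketch() is a hypothetical helper that is
# not part of Rsearch.
ee_rate_sketch <- function(quality) {
  q <- utf8ToInt(quality) - 33  # Phred score Q of each base
  mean(10^(-q / 10))            # mean per-base error probability
}
ee_rate_sketch("IIII")  # every base has Q = 40, so each probability is 1e-04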

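# A minimal sketch of the subsampling step described in Details, using a toy
# data frame. Assumptions, not taken from the package source: the names
# toy_reads and max_reads are hypothetical, and base sample() stands in for
# whatever sampling routine the function uses internally.
toy_reads <- data.frame(read_id = seq_len(25000))
max_reads <- 10000
if (nrow(toy_reads) > max_reads) {
  keep <- sample(nrow(toy_reads), max_reads)  # row indices, no replacement
  toy_reads <- toy_reads[keep, , drop = FALSE]
}
nrow(toy_reads)
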
}
