Last updated: 2025-04-09
Knit directory: GradLog/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of
the R Markdown file created these results, you’ll want to first commit
it to the Git repo. If you’re still working on the analysis, you can
ignore this warning. When you’re finished, you can run
wflow_publish to commit the R Markdown file and build the
HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20201014) was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version d380601. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish or
wflow_git_commit). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.DS_Store
Ignored: analysis/.Rhistory
Unstaged changes:
Modified: analysis/Log2024_new_beginning.Rmd
Modified: analysis/week_log.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (analysis/week_log.Rmd) and HTML
(docs/week_log.html) files. If you’ve configured a remote
Git repository (see ?wflow_git_remote), click on the
hyperlinks in the table below to view the files as they were in that
past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| html | ff0bca7 | liliw-w | 2024-04-24 | Build site. |
| html | 9fd6e5f | liliw-w | 2024-04-24 | Build site. |
| Rmd | 8173962 | liliw-w | 2024-04-24 | wflow_publish("analysis/week_log.Rmd") |
| html | 151bee4 | liliw-w | 2024-04-22 | Build site. |
| Rmd | b7a0bd9 | liliw-w | 2024-04-22 | test toc render |
| html | 423f69e | liliw-w | 2024-04-22 | Build site. |
| html | 1cf9a35 | liliw-w | 2024-04-22 | Build site. |
| Rmd | 2448521 | liliw-w | 2024-04-22 | rename |
| html | ae55f7a | liliw-w | 2024-04-22 | test toc render |
| Rmd | f595c33 | liliw-w | 2024-04-22 | rename |
| html | 76794f9 | liliw-w | 2024-02-20 | Build site. |
| Rmd | cd27671 | liliw-w | 2024-02-20 | test |
| html | 2b44f3b | liliw-w | 2024-02-20 | Build site. |
| Rmd | ab953a4 | liliw-w | 2024-02-20 | test |
| html | 79511af | liliw-w | 2024-02-20 | Build site. |
| Rmd | b4b7969 | liliw-w | 2024-02-20 | test |
| html | 355d984 | liliw-w | 2024-02-20 | Build site. |
| Rmd | d469ed5 | liliw-w | 2024-02-20 | test |
| html | 387ac23 | liliw-w | 2024-02-20 | Build site. |
| Rmd | 85299ae | liliw-w | 2024-02-20 | test |
| html | d8c2606 | liliw-w | 2024-02-20 | Build site. |
| Rmd | 26506ca | liliw-w | 2024-02-20 | test |
| html | c775adf | liliw-w | 2024-02-20 | Build site. |
| Rmd | f3d506d | liliw-w | 2024-02-20 | test |
| html | 8d7f0d8 | liliw-w | 2024-02-20 | Build site. |
| Rmd | 9b5c76e | liliw-w | 2024-02-20 | test |
| html | 46b8cba | liliw-w | 2024-02-20 | Build site. |
| Rmd | 04c5671 | liliw-w | 2024-02-20 | test |
Get familiar with and set up cluster.
Looked at the serial biopsies data.
GENOMIC_SPECIMEN.csv
Each row is a sample. Multiple samples can come from one patient; these samples are profiled at different time points.
One patient’s samples can be tested under more than one TEST_TYPE (3%).
Only consider patient specimens tested with OncoPanel from
PROFILECOHORT, by filtering
TEST_TYPE == 'ONCOPANEL_PROFILECOHORT'.
There are 27,148 specimens from 26,307 unique patients; 773 patients had multiple samples tested.
Of these, 719 patients were tested twice. The remaining 54 patients were tested 3 to 6 times.
I keep only samples from patients who were sampled exactly twice, which leaves 1,438 samples (719 × 2). [This corresponds to what we mentioned about the data: “~1000 patients”, “sequenced twice”]
Figure: Number of samples each patient has.
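The filtering steps above can be sketched in plain Python on toy rows. This is only an illustration under stated assumptions: `PATIENT_ID` and `SAMPLE_ID` are assumed column names (only `TEST_TYPE` and its value appear in the notes), and `OTHER_TEST` is a made-up test type.

```python
from collections import Counter

# Toy stand-in for rows of GENOMIC_SPECIMEN.csv. PATIENT_ID and SAMPLE_ID are
# assumed column names; "OTHER_TEST" is a hypothetical second test type.
specimens = [
    {"PATIENT_ID": "P1", "SAMPLE_ID": "S1", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P1", "SAMPLE_ID": "S2", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P2", "SAMPLE_ID": "S3", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P3", "SAMPLE_ID": "S4", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P3", "SAMPLE_ID": "S5", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P3", "SAMPLE_ID": "S6", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P4", "SAMPLE_ID": "S7", "TEST_TYPE": "ONCOPANEL_PROFILECOHORT"},
    {"PATIENT_ID": "P4", "SAMPLE_ID": "S8", "TEST_TYPE": "OTHER_TEST"},
]


def keep_twice_sampled(rows, test_type="ONCOPANEL_PROFILECOHORT"):
    """Return rows of the given TEST_TYPE from patients with exactly two such samples."""
    cohort = [r for r in rows if r["TEST_TYPE"] == test_type]
    # Count samples per patient after the TEST_TYPE filter, not before.
    n_samples = Counter(r["PATIENT_ID"] for r in cohort)
    return [r for r in cohort if n_samples[r["PATIENT_ID"]] == 2]


kept = keep_twice_sampled(specimens)
print(sorted(r["SAMPLE_ID"] for r in kept))
```

In this toy input only P1 survives: P2 has one sample, P3 has three, and P4 has only one sample left once the other test type is filtered out.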
Categorize each patient by the IS_MATCHED_FLG (normal or tumor sample) and REPORT_DT (date the report was signed out by the pathologist) of their two samples, which results in four categories.
| patient_category | n_patients |
|---|---|
| normal-normal | 13 |
| normal-tumor | 1 |
| tumor-normal | 14 |
| tumor-tumor | 691 |
Figure: Number of patients belonging to each category.
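One way to derive such a category per patient is sketched below, assuming the two samples are ordered by REPORT_DT and the labels are joined in that order. The `PATIENT_ID` column, the ISO date format, and the `kind` field (a normal/tumor label presumed to come from IS_MATCHED_FLG, whose raw values are not shown in these notes) are all assumptions.

```python
from collections import Counter, defaultdict
from datetime import date

# Toy rows. "kind" is a hypothetical normal/tumor label derived from
# IS_MATCHED_FLG; PATIENT_ID and the ISO date strings are also assumptions.
samples = [
    {"PATIENT_ID": "P1", "REPORT_DT": "2019-03-01", "kind": "tumor"},
    {"PATIENT_ID": "P1", "REPORT_DT": "2020-06-15", "kind": "tumor"},
    {"PATIENT_ID": "P2", "REPORT_DT": "2021-01-10", "kind": "normal"},
    {"PATIENT_ID": "P2", "REPORT_DT": "2018-11-05", "kind": "tumor"},
    {"PATIENT_ID": "P3", "REPORT_DT": "2017-02-02", "kind": "tumor"},
    {"PATIENT_ID": "P3", "REPORT_DT": "2017-09-30", "kind": "tumor"},
]


def patient_categories(rows):
    """Map each patient to '<earlier kind>-<later kind>', ordered by REPORT_DT."""
    by_patient = defaultdict(list)
    for r in rows:
        by_patient[r["PATIENT_ID"]].append(r)
    categories = {}
    for pid, pair in by_patient.items():
        ordered = sorted(pair, key=lambda r: date.fromisoformat(r["REPORT_DT"]))
        categories[pid] = "-".join(r["kind"] for r in ordered)
    return categories


categories = patient_categories(samples)
counts = Counter(categories.values())
print(counts)
```

Note that REPORT_DT is the sign-out date, not the collection date, so the order obtained this way is only a proxy for which biopsy was taken first.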
Questions -
There is no SAMPLE_COLLECTION_DT field recording the exact date each sample was collected, so it is hard to tell which sample was collected first.
Does “tumor-tumor” mean two samples were collected before and after treatment?
Why do normal samples have non-zero tumor purity? Some samples also have negative tumor purity.
Figure: Normal samples have non-zero tumor purity.
Figure: TUMOR_PURITY change across the two biopsies of each patient, for the four categories.
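The two purity oddities raised in the questions can be listed with a small check like the one below. It is a sketch only: `SAMPLE_ID` and the `kind` label (normal/tumor, presumed to come from IS_MATCHED_FLG) are assumed names, and the purity values are made up.

```python
# Toy rows. SAMPLE_ID and "kind" are assumed names; TUMOR_PURITY values are invented.
samples = [
    {"SAMPLE_ID": "S1", "kind": "tumor", "TUMOR_PURITY": 60},
    {"SAMPLE_ID": "S2", "kind": "normal", "TUMOR_PURITY": 0},
    {"SAMPLE_ID": "S3", "kind": "normal", "TUMOR_PURITY": 30},
    {"SAMPLE_ID": "S4", "kind": "tumor", "TUMOR_PURITY": -1},
]


def purity_flags(rows):
    """Return (SAMPLE_ID, reason) for samples whose TUMOR_PURITY looks inconsistent."""
    flagged = []
    for r in rows:
        purity = r["TUMOR_PURITY"]
        if purity < 0:
            flagged.append((r["SAMPLE_ID"], "negative purity"))
        elif r["kind"] == "normal" and purity != 0:
            flagged.append((r["SAMPLE_ID"], "normal sample with non-zero purity"))
    return flagged


flags = purity_flags(samples)
print(flags)
```

A negative value might be a missing-value code rather than a measurement, which is worth checking against the data dictionary before dropping or recoding anything.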
GENOMIC_MUTATION_RESULTS.csv
Use samples from above
Only SNP variants
Keep only patients for whom both samples have mutation records (some patients have only one sample with mutation info)
Resulting in 650 patients
Figure: Number of patients belonging to each category.
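The mutation-level filter can be sketched as follows, assuming the SNP filter is applied first and a patient is kept only if both samples still have at least one record. `SAMPLE_ID`, `VARIANT_TYPE`, and its values are hypothetical names; the notes do not say which column identifies SNPs.

```python
# Patient -> their two sample IDs from the specimen step (toy values).
pairs = {"P1": ("S1", "S2"), "P2": ("S3", "S4"), "P3": ("S5", "S6")}

# Toy stand-in for GENOMIC_MUTATION_RESULTS.csv. SAMPLE_ID and VARIANT_TYPE
# (and its values) are assumed names, not taken from the real file.
mutations = [
    {"SAMPLE_ID": "S1", "VARIANT_TYPE": "SNP"},
    {"SAMPLE_ID": "S2", "VARIANT_TYPE": "SNP"},
    {"SAMPLE_ID": "S2", "VARIANT_TYPE": "INS"},
    {"SAMPLE_ID": "S3", "VARIANT_TYPE": "SNP"},
    {"SAMPLE_ID": "S4", "VARIANT_TYPE": "DEL"},
    {"SAMPLE_ID": "S5", "VARIANT_TYPE": "SNP"},
]


def patients_with_snps_in_both(sample_pairs, mutation_rows):
    """Return patients whose two samples each have at least one SNP record."""
    samples_with_snp = {
        m["SAMPLE_ID"] for m in mutation_rows if m["VARIANT_TYPE"] == "SNP"
    }
    return sorted(
        pid
        for pid, sample_ids in sample_pairs.items()
        if all(s in samples_with_snp for s in sample_ids)
    )


kept_patients = patients_with_snps_in_both(pairs, mutations)
print(kept_patients)
```

Here P2 is dropped because its second sample has no SNP, and P3 because its second sample has no mutation record at all.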
ALLELE_FRACTION - Fraction of reads for the observed allele.
How many SNPs does each patient have?
Figure: Number of SNP variants each patient has.
Figure: ALLELE_FRACTION change across patients’ two biopsies for SNPs.
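A per-SNP change in ALLELE_FRACTION between the two biopsies could be computed along these lines. This is a sketch under assumptions: the (chromosome, position, alternate allele) key used to match a SNP across biopsies is hypothetical, and the values are toy numbers.

```python
# Toy SNP calls for one patient's two biopsies, keyed by a hypothetical
# (chromosome, position, alternate allele) tuple; values are ALLELE_FRACTION.
first = {("1", 1000, "T"): 0.31, ("7", 5500, "A"): 0.08}
second = {("1", 1000, "T"): 0.12, ("12", 250, "G"): 0.40}


def allele_fraction_change(first_biopsy, second_biopsy):
    """Map each SNP seen in either biopsy to (AF first, AF second, change).

    A SNP missing from one biopsy gets None for that value and for the change;
    treating it as 0 instead would be a different modelling choice.
    """
    changes = {}
    for key in sorted(set(first_biopsy) | set(second_biopsy)):
        af1 = first_biopsy.get(key)
        af2 = second_biopsy.get(key)
        delta = None if af1 is None or af2 is None else af2 - af1
        changes[key] = (af1, af2, delta)
    return changes


changes = allele_fraction_change(first, second)
n_snps = len(changes)  # number of distinct SNPs seen for this patient
for key, (af1, af2, delta) in changes.items():
    print(key, af1, af2, delta)
```

Keeping SNPs that appear in only one biopsy, rather than silently dropping them, makes it possible to separate variants that were gained or lost from variants whose fraction merely shifted.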
Questions -
Set up Eris cluster
Can’t login after creating account? Wait 24 hours for the account to be activated.
Can’t submit a job to Slurm? Permission issue, contact the help team.
Access to RStudio Server and Jupyter Notebook
Username is lower case
Incorrect username or password? Contact help team to restart the server.
Things to double check -
Results put in /data/gusev/USER/llw?
Dropbox lab access?
MPG weekly meeting, location?
Download Doug’s data & upload to cluster
R version 4.2.3 (2023-03-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur ... 10.16
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.9.3 forcats_1.0.0 stringr_1.5.0 dplyr_1.1.3
[5] purrr_1.0.2 readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
[9] ggplot2_3.4.3 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 xfun_0.39 bslib_0.5.0 colorspace_2.1-0
[5] vctrs_0.6.3 generics_0.1.3 htmltools_0.5.5 yaml_2.3.7
[9] utf8_1.2.3 rlang_1.1.1 jquerylib_0.1.4 later_1.3.1
[13] pillar_1.9.0 glue_1.6.2 withr_2.5.1 lifecycle_1.0.3
[17] munsell_0.5.0 gtable_0.3.4 workflowr_1.7.0 evaluate_0.21
[21] knitr_1.43 tzdb_0.4.0 fastmap_1.1.1 httpuv_1.6.11
[25] fansi_1.0.4 highr_0.10 Rcpp_1.0.11 promises_1.2.0.1
[29] scales_1.2.1 cachem_1.0.8 jsonlite_1.8.7 fs_1.6.2
[33] hms_1.1.3 digest_0.6.33 stringi_1.7.12 rprojroot_2.0.3
[37] grid_4.2.3 cli_3.6.1 tools_4.2.3 magrittr_2.0.3
[41] sass_0.4.6 whisker_0.4.1 pkgconfig_2.0.3 timechange_0.2.0
[45] rmarkdown_2.23 rstudioapi_0.15.0 R6_2.5.1 git2r_0.32.0
[49] compiler_4.2.3