Interactive Analyses
Learning Objectives
Know how to request VICE access.
Know how to launch and access a VICE application.
Understand how to configure the VICE application and run an analysis.
Save the outputs of a VICE application to your Data Store.
Description:
The Visual Interactive Computing Environment (VICE) allows you to work with popular interactive data science applications such as JupyterLab, RStudio, the Linux shell, and others. In this exercise we will cover a simple introductory use case: visualizing a phylogenetic tree.
In this exercise we will:
Launch an RStudio session, loading the sequence alignments created earlier in the course.
Install an R package and create a phylogenetic tree from the alignment, saving it to a file.
Save our work to the Data Store and terminate the application.
Tip
Why use VICE?
The Discovery Environment excels at running compute-intensive analyses non-interactively. In other words, once you launch a job in the DE, you get an output, but to start a new analysis (for example, to tweak parameters) you need to relaunch that job and await new results. This style of computing allows you to run large jobs that require lots of resources. However, several analyses we’d like to do are interactive: we need to visualize results and manipulate parameters on the fly, for example when creating a figure where you need to see and adjust the results of an upstream analysis. This kind of work is often done using tools like R and RStudio, or other programming tools such as Jupyter. Hence VICE!
Input Data:
Input | Description | Example |
---|---|---|
The output of a multiple species DNA sequence alignment from MUSCLE | We will use a multiple alignment generated in the Discovery Environment by the MUSCLE App in the previous sections to generate a phylogenetic tree using some tools in R. | View the example MUSCLE output folder. |
Getting VICE Access
To minimize inappropriate use, VICE is a restricted service, currently accessible from CyVerse US. You must request access to use it.
Visit the CyVerse User Portal and access the services panel; look for DE – VICE and select the REQUEST ACCESS link.
Tip
Ensure that your CyVerse account is associated with an ORCID and a valid email address from an organization (i.e., .org, an educational institution with the .edu ending, or a government .gov). We will not grant access to accounts using commercial email addresses, e.g., @gmail.com, @yahoo.com, @msn.com, etc.
Launching a VICE application
If necessary, log into the CyVerse Discovery Environment.
Use this quicklaunch link or click on the Apps icon to launch the Rocker RStudio Latest App. You can also use the DE search bar to search for this application in the Apps category.
Tip
We provide and maintain the latest versions of R and RStudio made available by the Rocker Project.
Launch the application and adjust the following:
Under “Analysis Info”, add comments if desired; click Next;
For “Parameters”, under “Input Folder” click Browse and navigate to
/iplant/home/shared/cyverse_training/cyverse_mooc/muscle_3_8_31/02_muscle_output
We will use this entire folder as input for our project. Click Select Current Folder; then Next;
Click Next to skip Advanced Settings;
Click Launch Analysis to launch your application.
At this point you will be redirected to the Analyses menu. Your application will be listed as “Submitted”, usually for just a few minutes, but possibly longer depending on the size of the application software and any imported datasets.
When the Status of the launch is Running, click on the link-out icon; a new tab where your VICE application will run should open in your browser.
Tip
Even once the application is in the Running status, you may still have to wait some additional time if data is being transferred.
Completing our analysis in R
Once you have your RStudio session, it will largely behave like an RStudio session running on your local desktop. Potential benefits of running RStudio in VICE include more processing power (especially if you choose additional resources at launch; see the Advanced Settings). Since this session is running on CyVerse hardware, large datasets can also be transferred faster. To complete our analysis, we will install the ape package and compute a phylogenetic tree.
Tip
While you don’t need to be an expert R user to complete this section, familiarity with R will help since we won’t be going into specific detail about this example.
Tip
The data we loaded at launch of the VICE application will be in the ‘work’ directory at ‘home’.
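Before reading the alignment, it can help to confirm that the input folder arrived where you expect it. The following is a minimal sketch in base R, assuming the data were staged under ~/work as described above; the helper name `list_inputs` is hypothetical and not part of VICE or RStudio.

```r
# List everything under a directory, returning paths relative to it.
# `list_inputs` is an illustrative helper, not part of VICE or RStudio.
list_inputs <- function(dir = "~/work") {
  dir <- path.expand(dir)
  if (!dir.exists(dir)) {
    stop("Directory not found: ", dir)
  }
  list.files(dir, recursive = TRUE)
}

# Example: run in the VICE RStudio console to see what was transferred
# list_inputs("~/work")
```

If the folder is still empty, the data transfer may not have finished yet.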
From the R console, we will run the following commands:
    # Install and load the needed R library
    install.packages("ape")
    library(ape)

    # Read in the aligned DNA FASTA file
    alignment <- read.FASTA("~/work/02_muscle_output/fasta.aln", type = "DNA")

    # Create a distance matrix for the sequences
    dist_mtrx <- dist.dna(alignment)

    # Compute a neighbor-joining tree
    nj_tree <- nj(dist_mtrx)

    # Plot the tree
    plot.phylo(nj_tree)

    # Save the tree to a file
    write.tree(nj_tree, file = "~/work/tree.newick")
You should now see the resulting tree in the plot pane and have a new file, ‘tree.newick’, in your work directory.
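To confirm that the tree file was written correctly, you could run a quick sanity check. The sketch below uses only base R and relies on two general properties of the Newick format: a tree ends with a semicolon and its parentheses are balanced. The helper `check_newick` is hypothetical (it is not an ape function), and it is a rough check only; quoted labels that contain parentheses would confuse it.

```r
# Basic sanity check for a Newick file: it exists, is non-empty,
# has balanced parentheses, and ends with a semicolon.
# `check_newick` is an illustrative helper, not an ape function.
check_newick <- function(path) {
  path <- path.expand(path)
  if (!file.exists(path)) return(FALSE)
  txt <- trimws(paste(readLines(path, warn = FALSE), collapse = ""))
  if (!nzchar(txt)) return(FALSE)
  chars <- strsplit(txt, "")[[1]]
  n_open <- sum(chars == "(")
  n_close <- sum(chars == ")")
  n_open == n_close && endsWith(txt, ";")
}

# Example: run after write.tree() in the VICE RStudio console
# check_newick("~/work/tree.newick")
```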
Terminating your VICE session and saving work to the Data Store
Once you have completed your work, you can save your work to the Data Store and terminate your VICE application.
Tip
VICE applications typically have a 48-hour run time. Unless you request an extension, your application will stop at the end of that period and automatically save its outputs. We still recommend saving your work to the Data Store before the time expires.
In the Analyses pane of the Discovery Environment, select your running RStudio VICE application.
Under More Actions, select Terminate; confirm Termination on the pop-up notice.
When the VICE application has the status Completed, click the folder icon to view the folder in your Data Store where results will be written. It may take time for all outputs to be saved, depending on the size of the data generated.
Tip
You don’t have to terminate your analysis to save your work to the Data Store. From the terminal within the RStudio environment, you can use iCommands to transfer data (see the Data Store Guide on iCommands). RStudio itself also allows you to download files and plots directly to your local computer, using the Export features in the file pane.
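If you would rather keep a figure alongside your other files than export it by hand, one option is to write it to disk with base R's pdf() device. This is a sketch only: the helper `save_plot_pdf` is hypothetical, and it assumes that files written under ~/work are saved with your outputs, as with tree.newick above.

```r
# Draw a plot into a PDF file. `draw` is any function that produces a plot,
# for example function() plot.phylo(nj_tree) once ape is loaded.
# `save_plot_pdf` is an illustrative helper, not part of RStudio or ape.
save_plot_pdf <- function(draw, path, width = 7, height = 7) {
  path <- path.expand(path)
  pdf(path, width = width, height = height)
  on.exit(dev.off(), add = TRUE)  # close the device even if `draw` fails
  draw()
  invisible(path)
}

# Example with a base R plot; swap in your own plotting call and path
out <- save_plot_pdf(function() plot(1:10), file.path(tempdir(), "example.pdf"))
```

In the VICE session you might instead pass a path such as "~/work/tree.pdf" so the figure sits next to the Newick file.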
Output/Results
Output |
Description |
Example |
---|---|---|
‘tree.newick’ |
A Newick-formatted phylogenetic tree file which can visualized using your choice of tools. |