WORKSHEET - OVERVIEW


1.1 - ABOUT THE WORKSHEET

Worksheet is a desktop client application provided by the GPRO suite to manage the downstream steps of data integration, prioritisation and knowledge discovery on results from omic annotations. Worksheet is a dynamic grid of columns and rows, managed through menus, that allows the user to easily manage one or more annotation sets. The software is coupled with the GPRO server infrastructure to query several knowledge databases from which Worksheet extracts the annotations. Worksheet can also link FASTA files to the open annotation file in order to mine, filter or prioritise sequences according to the information provided by the annotations.

1.2 - VERSION

The current version of Worksheet is 2.0, which is distributed within the GPRO suite as an installer for Windows 7 (64 bit), a self-extracting disk image for Mac OS X 10.6 or later (64 bit), and a compressed tar file for Linux 2.6 kernel series or later (64 bit).

1.3 - INSTALLING WORKSHEET IN YOUR PC

Worksheet is a Java application that can be easily installed on PCs that have at least 2GB of RAM and Java (JRE or JDK) version 11 or above installed.

To check if you already have a JDK installed, open a command line interface and type:

java -version
If you have Java version 11 installed, you should see a message similar to the following:
$ java -version
openjdk version "11.0.14.1" 2022-02-08
OpenJDK Runtime Environment (build 11.0.14.1+1-Ubuntu-0ubuntu1.20.04)
OpenJDK 64-Bit Server VM (build 11.0.14.1+1-Ubuntu-0ubuntu1.20.04, mixed mode, sharing)
If you get a “Command not found” error message, the JRE is not installed (or is not on your PATH).
To install the JRE, go to the official JRE repository here and download the version matching your operating system. Once it is installed, check the output of the java -version command again in a command line interface. Sometimes the JRE is installed but its location has not been added to the PATH.
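
The check above can also be scripted. The following is a minimal Python sketch, assuming that java is on the PATH; the helper names parse_java_major and installed_java_major are illustrative and are not part of Worksheet. It parses the output of java -version (which Java writes to standard error) and reports whether the major version is at least 11.

```python
import re
import shutil
import subprocess

def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    # Matches e.g.: openjdk version "11.0.14.1"  or  java version "1.8.0_292"
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major

def installed_java_major():
    """Run `java -version`; return the major version, or None if java is not found."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # `java -version` prints to stderr; fall back to stdout just in case
    return parse_java_major(result.stderr or result.stdout)

if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found on the PATH.")
    elif major >= 11:
        print(f"Java {major} found: meets the requirement (11 or above).")
    else:
        print(f"Java {major} found: version 11 or above is required.")
```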

To install the Windows version:
Extract the archive using an archive utility such as WinRAR. Then browse to the executable file “worksheet.exe” and run it.

To install the Mac version:
Extract the archive to the desired destination using:
$ unzip worksheet.x.y.os.zip
Then browse to the executable binary file “worksheet.app” and run it.

To install the Linux version:
Extract the archive to the desired destination using:
$ unzip worksheet.x.y.os.zip
Then browse to the executable binary file “worksheet” and run it.

NOTE: The appearance of Worksheet shown in this manual may vary depending on the operating system used.

1.4 - GETTING FAMILIAR

The layout of Worksheet is based on the following features: “Directory Browser”, “Worksheet Space”, “Top Menu” and “Worksheet Menu”.

Figure 1: Basic layout of Worksheet. The Directory Browser and the Worksheet Space can be resized as convenient, or hidden by clicking on the icons at the top right corner.


DIRECTORY BROWSER: sets any folder on your PC as the directory for storing the files managed by Worksheet.

WORKSHEET SPACE: Central worksheet space to manage annotation files.

MENUS: Worksheet provides two menus, the top menu and the worksheet menu, with distinct functions to manage the data. The TOP MENU manages general aspects of Worksheet and is organized as follows.

Figure 2: Top menu.

DIRECTORY:

FILE:

WORKSHEET SETTINGS:


PREFERENCES:

HELP:

The WORKSHEET MENU is available at the top of the worksheet space and presents the following utilities.



Figure 3: Worksheet menu.

FILE:

EDIT:


SORTING/FILTERING:

ANNOTATION:

SELECT:


ASSOCIATE DATABASE:

POSTPROCESSING:

STATISTICS:

1.5 - SERVER DEPENDENCIES

Worksheet implements some utilities that call the GPRO server infrastructure. For Worksheet to work properly, you must install the required server dependencies.

A tutorial for installation of all server dependencies of Worksheet is available here.

1.6 - TESTING THE SERVER CONNECTION

Once the server infrastructure has been installed and configured, you must link Worksheet to it. To do this, select “Preferences → Pipeline connection settings” in the top menu and enter the following in the configuration dialog:

  1. Your email address: to receive notifications from the server.
  2. Host address: the address of the server you want to connect to.
  3. Port number: the port used for SSH. The default is 22.
  4. Username and password: your credentials on the host server.
To test whether you are connected to the server, click on the button “Test connection settings”. If you are connected, you will be notified as shown in the figure below.
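
If the test fails, it can help to confirm that the host and port are reachable from your computer at all. The Python sketch below is independent of Worksheet: it only checks that a TCP connection to the given port can be opened, and it does not verify the username or password. The function name port_is_reachable and the host name in the usage comment are placeholders.

```python
import socket

def port_is_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures
        return False

# Usage (placeholder host name, replace with your server's address):
#   port_is_reachable("gpro.example.org", 22)
```

A result of False suggests a network, firewall or proxy problem rather than a problem with the credentials.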

Figure 4: Pipeline connection settings.


If you need to connect to the server infrastructure through a proxy, Worksheet must be configured accordingly. You can choose one of the following three methods:
  1. If you do not know the proxy settings, choose the "Use system proxy settings" option to let Worksheet detect the default proxy settings already configured on your computer.
  2. If you know your proxy settings, you can specify the proxy configuration. This is the preferred option when using a network proxy. User, password and FTP settings are optional. The port for HTTP proxy is usually 8080 by default.
  3. If you have a Proxy Automatic Configuration file (.pac) URL, use it for loading settings automatically from a remote file.
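
If you are unsure whether a system proxy is configured at all, the following Python sketch lists what the operating system reports. It is unrelated to how Worksheet itself detects the proxy; the helper name summarize_proxies is illustrative. It relies on the standard library function urllib.request.getproxies, which reads proxy environment variables and, on Windows and macOS, the system configuration.

```python
from urllib.request import getproxies

def summarize_proxies(proxies):
    """Return one 'scheme -> address' line per configured proxy, sorted by scheme."""
    if not proxies:
        return ["No system proxy configured."]
    return [f"{scheme} -> {address}" for scheme, address in sorted(proxies.items())]

# Print the proxy settings visible to this computer's environment
for line in summarize_proxies(getproxies()):
    print(line)
```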

Figure 5: Proxy configuration (only needed if you connect to the server through a proxy).

1.7 - ASSIGNING RAM TO WORKSHEET

The RAM assigned to Worksheet can be modified by editing two parameters (Xms and Xmx) in the configuration file named “Worksheet.ini”. On Linux and Windows computers, the “Worksheet.ini” configuration file is located inside the Worksheet app folder. On macOS computers, this configuration file can be found by right-clicking on Worksheet.app → Show package contents → Contents → MacOS → Worksheet.ini.

Within the “Worksheet.ini” file, Xms and Xmx parameters look like this:

- Xms1024m (Minimum allocated memory)
- Xmx2048m (Maximum allocated memory)
The values are the assigned RAM in megabytes. To change the RAM assigned to Worksheet, just change these values. Keep in mind that the limit depends on the amount of RAM in your computer. For example, if your computer has 8GB of RAM, Xms2048m and Xmx4096m are recommended for better performance. Xmx can be increased up to Xmx6144m, but avoid values close to the maximum available memory of your PC, as this may make the operating system unstable.
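
As a worked example of the arithmetic, the 8GB recommendation corresponds to one quarter of the installed RAM for Xms and one half for Xmx. The Python sketch below applies the same proportions to other RAM sizes. Extending these ratios beyond the 8GB case is an assumption made for illustration, not a documented rule, and the function name suggest_heap_settings is hypothetical.

```python
def suggest_heap_settings(total_ram_gb):
    """Suggest Xms/Xmx values (in MB) from the total RAM in GB.

    Assumption: the 1/4 (Xms) and 1/2 (Xmx) proportions of the manual's
    8GB example are reused for other RAM sizes.
    """
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    total_mb = int(total_ram_gb * 1024)
    xms_mb = total_mb // 4  # minimum allocated memory
    xmx_mb = total_mb // 2  # maximum allocated memory
    return f"Xms{xms_mb}m", f"Xmx{xmx_mb}m"

print(suggest_heap_settings(8))   # ('Xms2048m', 'Xmx4096m')
print(suggest_heap_settings(16))  # ('Xms4096m', 'Xmx8192m')
```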

1.8 - INPUT FILES: SINGLEHIT/MULTIHIT

The input file managed by Worksheet is a plain text file with sequences and their annotations, normally a CSV file, although Worksheet can also read plain text files in any format (txt, gff, gtf, etc.). Rows refer to the query sequences, while columns refer to annotation sources/items. An example of a CSV file typically accepted by Worksheet can be downloaded here.

When you open a CSV file with Worksheet, a dialog appears (Figure 6) in which you tell the Worksheet interpreter which column separator (spaces, commas or semicolons) your file uses. You can also tell the interpreter whether decimal separators are commas or dots, and whether to open the CSV as a multihit or as a singlehit file (option “Worksheet contains multiple high scoring pairs (HSP) for each match”).
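
The effect of the column separator and decimal separator options can be illustrated outside Worksheet. The Python sketch below is not Worksheet's own parser; the function read_annotation_table and the sample columns (query, hit, score) are hypothetical. It reads a semicolon-separated text in which decimals are written with commas.

```python
import csv
import io

def read_annotation_table(text, separator=",", decimal="."):
    """Parse delimited text into a list of dicts, converting numeric cells to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text), delimiter=separator):
        parsed = {}
        for column, value in record.items():
            if not isinstance(value, str):
                parsed[column] = value  # missing or surplus cells are kept as-is
                continue
            candidate = value.replace(decimal, ".") if decimal != "." else value
            try:
                parsed[column] = float(candidate)
            except ValueError:
                parsed[column] = value  # keep non-numeric cells as text
        rows.append(parsed)
    return rows

# Hypothetical sample: semicolon-separated columns, comma as decimal separator
sample = "query;hit;score\nseq1;protA;98,5\nseq2;protB;45,2\n"
table = read_annotation_table(sample, separator=";", decimal=",")
print(table[0]["score"])  # 98.5
```

Choosing the wrong separator in this sketch would leave each row as a single text cell, which is the kind of misreading the dialog options are meant to prevent.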

Figure 6: Opening a csv file with Worksheet.


The multihit/singlehit option is useful if you have multiple annotations per query (from a BLAST search, for instance). In singlehit mode, Worksheet shows only the best hit per query. If you select the multihit option, Worksheet splits the CSV into two files: a singlehit file, which keeps the original name, and a multihit file, which is given the name _multihit.csv. Worksheet then opens the singlehit file coupled with the multihit file, so that you can view all detected hits per sequence in a separate dialog and, if appropriate, replace the current annotation of that sequence with another one using the mouse, as described in the section “Mouse Actions” below. By default this option is unselected, so Worksheet opens the CSV as a singlehit file.
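
The idea of deriving a singlehit table from a multihit table can be sketched as follows. This is an illustration only, not Worksheet's implementation: the column names query and bitscore, the rule that the best hit is the row with the highest bit score, and the function name split_multihit are all assumptions made for the example. The sketch works in memory and does not write any files.

```python
import csv
import io

def split_multihit(text, query_column="query", score_column="bitscore"):
    """Return (singlehit_rows, multihit_rows) from CSV text with several hits per query.

    multihit_rows keeps every row; singlehit_rows keeps one row per query,
    chosen here as the row with the highest score (an assumption).
    """
    multihit_rows = list(csv.DictReader(io.StringIO(text)))
    best = {}
    for row in multihit_rows:
        query = row[query_column]
        if query not in best or float(row[score_column]) > float(best[query][score_column]):
            best[query] = row
    # dicts preserve insertion order, so queries stay in first-seen order
    singlehit_rows = list(best.values())
    return singlehit_rows, multihit_rows

# Hypothetical sample with two hits for seq1 and one hit for seq2
sample = (
    "query,hit,bitscore\n"
    "seq1,protA,120.0\n"
    "seq1,protB,310.5\n"
    "seq2,protC,88.1\n"
)
single, multi = split_multihit(sample)
print([(r["query"], r["hit"]) for r in single])  # [('seq1', 'protB'), ('seq2', 'protC')]
print(len(multi))  # 3
```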

1.9 - MOUSE ACTIONS

Worksheet includes some mouse-driven utilities that are used simply by positioning the mouse in the corresponding place and clicking. Right clicking on the directory space opens a dialog providing the typical actions to create and manage files, folders and worksheets. You can also cut, copy, paste, delete and rename files.

Figure 7: Mouse-dependent options and tricks.


As shown in Figure 7, selecting and left clicking on any row cell (corresponding to a sequence) displays an editable note field where you can add information about the selected sequence. All Worksheet cells (including column names) can be edited with the mouse. Right clicking on the Worksheet space opens a dialog with two options: Columns and Rows. The first offers a sub-dialog with actions for adding, selecting, removing, renaming, joining and splitting columns. The second offers the same and other actions for rows. For instance, if you are working with a multihit CSV file, you can place the mouse on a row and right click > Rows > Show BLAST multihit to view the set of alternative best hits for your sequence annotation and, if appropriate, replace it with another selected from this summary. You can also view the sequence details by accessing the note field of any particular sequence (also accessible by right clicking directly on the Worksheet grid).






