File Service FAQ

How do I add files to PHC?

Access the file service and load files into PHC through the following methods:

What is the best method for bulk file uploading?

Easily upload a large number of files with the following methods:

What is the best method to transfer large individual files (larger than 500 GB)?

For most use cases, the LifeOmic CLI transfers files successfully and provides an easy-to-use terminal experience.

However, uploading very large files with the LifeOmic CLI can run into problems caused by a slow internet uplink, the long time the upload requires, or an interrupted connection.

Configuring SFTP to a Project is recommended when transferring files of this size. SFTP can overcome some of the issues you might encounter with the CLI, and it can resume a transfer of a file that has been only partially transferred.
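To illustrate what resuming a partial transfer means, here is a minimal sketch that uses two local files as a stand-in for the client and server sides. It is not how LifeOmic or any particular SFTP client implements resume; the function name `resume_copy` and its parameters are hypothetical. The idea is simply to find how many bytes already arrived and send only the rest.

```python
import os


def resume_copy(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024) -> int:
    """Append the missing tail of src_path to dst_path and return the bytes written.

    Local files stand in for the two ends of a transfer (an illustration only).
    """
    # Bytes already present at the destination from an earlier, interrupted attempt.
    offset = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)  # skip the part that was already transferred
        while chunk := src.read(chunk_size):
            dst.write(chunk)
            written += len(chunk)
    return written
```

A real client would also want to verify that the partial file actually matches the start of the source, for example with a checksum, before appending to it.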

Other good options are the PHC SDK for Python and Transfer Amazon S3 Files with PHC Web Console, but both require more advanced knowledge and more extensive configuration than SFTP.

How are my files protected?

LifeOmic File Service encrypts data in transit between the client and server, so you do not need to encrypt your files before uploading them. Data is also encrypted at rest in storage. LifeOmic File Service uses strong encryption and key management.

What is LifeOmic File Service?

LifeOmic File Service is a managed service for storing and retrieving file data.

Once files are uploaded, they are available to the other PHC services for use and analysis.

  • Omics Explorer is one example: uploaded VCF files, once indexed, can be queried across genetic variants (SNV, CNV, Fusion) in real time in the PHC Web Console.

  • Task Service and Notebook Service are two examples where files can be brought in for analysis and compute with your own code.

Common examples of uploaded file data are:

  • Genomic Variants, Gene expression, Proteomics, Pharmacogenetics - file formats such as VCF, BAM, CSV, TSV
  • Documents, Images, and Audio - file formats such as JPEG, PNG, DICOM, and PDF

What parts of PHC use the LifeOmic File Service?

All parts of the PHC that store and retrieve file data rely on the LifeOmic File Service.

How are files organized? What is a project (aka data-set)?

The PHC platform organizes data, including files, under user-defined projects (aka data-sets).

Inside a project, files can be organized further into directory structures so that they do not appear as one long flat list.

Does deleting a project (aka data-set) delete all associated files?

Yes, deleting a project deletes the project and all associated data.

After you select Delete this Project, the project stays active for 14 days before the files are actually deleted. You can cancel the deletion during this time period. You also have the option to delete the files immediately.

What access control is available?

Projects let the organization put access control in place.

Application-level access controls are enforced when viewing, downloading, and uploading files.

Example: A user can be configured to query and search genetic data for a subject, but be restricted from downloading the subject's file(s) on a per-project basis.

Access control refers to the ability to control who can interact with a resource within the platform. For cases requiring complex access control, the LifeOmic platform uses Attribute-Based Access Control (ABAC) to assign attributes that dictate what information users can access. For more information, see the Account Management Overview.

How durable is LifeOmic File Service?

LifeOmic File Service is powered by AWS S3. For more information, see the AWS S3 FAQ.

Are my files backed up?

Cross-region replication and backups are included as part of the service.

Can arbitrary files be stored without a subject/patient?

The only required information to store a file is a project identifier.

Files are not required to be tied to a specific subject within a project.

If you want to link a file to a subject, you can use a FHIR DocumentReference, created through the LifeOmic FHIR Service, to provide the link.
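As a rough sketch of what such a link can look like, the snippet below builds a DocumentReference as a plain Python dictionary. The field names follow the standard FHIR DocumentReference resource; the helper name `build_document_reference`, the subject id, and the file URL are hypothetical placeholders. Which additional fields the LifeOmic FHIR Service requires (for example, project tags or a document type) and how it expects the file to be referenced are not covered here.

```python
import json


def build_document_reference(subject_id: str, file_url: str,
                             title: str, content_type: str) -> dict:
    """Build a minimal FHIR DocumentReference that points a subject at a file."""
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        # The subject (patient) this file belongs to.
        "subject": {"reference": f"Patient/{subject_id}"},
        # The file itself, described as an attachment.
        "content": [
            {
                "attachment": {
                    "url": file_url,
                    "title": title,
                    "contentType": content_type,
                }
            }
        ],
    }


doc_ref = build_document_reference(
    subject_id="example-subject-id",                       # hypothetical id
    file_url="https://example.org/files/example-file-id",  # hypothetical URL
    title="sample.vcf",
    content_type="text/plain",
)
print(json.dumps(doc_ref, indent=2))
```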

How can I get started uploading data?

The LifeOmic CLI is the best option to get started uploading file data to the PHC. Once it is installed, run the lo files --help command to see the available file commands, including those for uploading.

What limits are in place for LifeOmic File Service?

The number of files you can store is unlimited. The maximum size of a single file is 5 terabytes.

The LifeOmic CLI can manage uploading this amount of data to the PHC.
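The size guidance in this FAQ (a 5 terabyte maximum, and SFTP recommended for individual files larger than 500 GB) can be sketched as a small pre-upload check. This is only an illustration: the function name and return values are hypothetical, and the use of decimal units (1 GB = 10^9 bytes) is an assumption, since the FAQ does not say whether the limits are decimal or binary.

```python
GB = 10**9   # assumption: decimal units
TB = 10**12  # assumption: decimal units

LARGE_FILE_THRESHOLD = 500 * GB  # above this, the FAQ recommends SFTP
MAX_FILE_SIZE = 5 * TB           # maximum size of a single file


def suggest_transfer_method(size_bytes: int) -> str:
    """Return "cli", "sftp", or "too large" for a file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes > MAX_FILE_SIZE:
        return "too large"
    if size_bytes > LARGE_FILE_THRESHOLD:
        return "sftp"
    return "cli"
```

For a local file, `os.path.getsize(path)` gives the size in bytes to pass in.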

What file name restrictions are in place?

File names must match the pattern ^([a-zA-Z0-9!\\-+.*'()&$@=;+ ,?:_/]*)$ and be fewer than 970 characters long.
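A name can be checked locally before uploading. The sketch below is a minimal, unofficial validator; the function and constant names are hypothetical. It treats the backslash sequence before the hyphen as an escaped literal hyphen, which is an assumption about how the pattern is meant to be read, and it is not guaranteed to agree with the service's own validation in every edge case.

```python
import re

# Assumption: the backslash sequence before "-" in the documented pattern
# denotes a literal hyphen inside the character class.
FILE_NAME_PATTERN = re.compile(r"^([a-zA-Z0-9!\-+.*'()&$@=;+ ,?:_/]*)$")
MAX_NAME_LENGTH = 970  # names must be shorter than this many characters


def is_valid_file_name(name: str) -> bool:
    """Return True if name fits the documented character set and length limit."""
    # fullmatch avoids accepting a name that ends with a newline.
    return len(name) < MAX_NAME_LENGTH and FILE_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_file_name("reports/2024/sample-01.vcf"))  # True
print(is_valid_file_name("bad#name.txt"))                # False: "#" is not allowed
```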

What files can be viewed within the PHC Web Console?

Common file types found on the web, such as text, images, and markdown, are viewable within the PHC Web Console.

Certain file types open in web-based viewers. Some examples are CSV/TSV files, DICOM images, ipynb notebook files, and PDF files.

Files larger than 5 MB open in a new tab.

How can I document my files?

When a directory of files contains a README.md, its markdown is rendered below the file listing as an inline description.

How can I reference my file?

A unique identifier is created for every file uploaded to the PHC. To reference a file, use a LifeOmic Resource Name (LRN), which remains a stable pointer to the file. Future file renames will not break the LRN reference.

How can I share my project with collaborators?

Users can create shareable links (URLs) from the PHC Web Console for those who have access to the project.

Granting access to projects is available through access control and ABAC.

How can I trigger automation to run against the recent set of file transfers?

Users can define file glob patterns that trigger File Actions. File Actions let you automate work described in Common Workflow Language (CWL).
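To get a feel for how a glob pattern selects files, the sketch below uses Python's standard `fnmatch` module. This only illustrates glob matching in general: the exact glob dialect that File Actions use is not stated here, and dialects differ (in `fnmatch`, for example, `*` also matches `/`). The helper name and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def files_matching(pattern: str, file_names: list[str]) -> list[str]:
    """Return the file names that match a glob pattern (case-sensitive)."""
    return [name for name in file_names if fnmatchcase(name, pattern)]


# Hypothetical file names from a recent batch of transfers.
uploaded = [
    "runs/2024/sample1.vcf",
    "runs/2024/sample1.bam",
    "notes/README.md",
]

print(files_matching("*.vcf", uploaded))         # ['runs/2024/sample1.vcf']
print(files_matching("runs/*/*.bam", uploaded))  # ['runs/2024/sample1.bam']
```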