Sample data

To use ForiC, you need a genome file in FASTA format. For testing purposes, you can use the provided file for Mycoplasma gallisepticum VA94_7994-1-7P. If needed, download the file CP003506.fna from the following link:

Download Sample File (955kb)

Upload the genome file in FASTA format

To run ForiC, select and upload a genome file in FASTA format. No additional input is required, because ForiC applies default values to the following optional settings:
  • Window size: This parameter allows for the selection of the window size used for calculations. The default window size is set to 1% of the genome length, but it can be adjusted within a range from 0.1% to 5.0%, with increments of 0.1%.
  • Prediction interval length: By default, the interval length is set to one window size. For higher accuracy, this can be adjusted to two window sizes. If the prediction region falls within the first window of the genome, ForiC always uses an interval of one window size.
For more information about the preselected settings of ForiC, please refer to the article (currently in the process of submission).
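
The relationship between the two settings above can be illustrated with a short Python sketch. It is not ForiC's own code: the function names, the rounding to whole base pairs, and the way "falls within the first window" is tested (a 0-based start position smaller than the window size) are assumptions made for illustration only.

```python
def window_size_bp(genome_length: int, window_pct: float = 1.0) -> int:
    """Window size in bp for a window given as a percentage of genome length.

    Hypothetical helper: mirrors the documented range (0.1% to 5.0%)
    and step (0.1%), but is not taken from ForiC itself.
    """
    steps = round(window_pct * 10)  # number of 0.1% increments
    if abs(window_pct * 10 - steps) > 1e-9:
        raise ValueError("window_pct must be a multiple of 0.1")
    if not 1 <= steps <= 50:
        raise ValueError("window_pct must be between 0.1 and 5.0")
    # Integer arithmetic avoids floating-point rounding; at least 1 bp.
    return max(1, genome_length * steps // 1000)


def prediction_interval_bp(window_bp: int, n_windows: int = 1,
                           region_start: int = 0) -> int:
    """Length of the prediction interval in bp.

    Assumption: the region "falls within the first window" when its
    0-based start position is smaller than one window size.
    """
    if n_windows not in (1, 2):
        raise ValueError("n_windows must be 1 or 2")
    if region_start < window_bp:
        return window_bp  # always one window at the start of the genome
    return n_windows * window_bp


# Hypothetical genome of 1,000,000 bp with the default 1% window.
w = window_size_bp(1_000_000)
print(w)
print(prediction_interval_bp(w, n_windows=2, region_start=500_000))
```

With these assumptions, a 1,000,000 bp genome gives a default window of 10,000 bp, and choosing two window sizes gives a 20,000 bp prediction interval unless the region starts inside the first window.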

Prevalidation filters

This section defines the criteria and thresholds used to determine whether a sequence proceeds further in the analysis or is excluded at this stage.

Knockout Criteria

If any of the knockout conditions is met, the sequence will be excluded, and the program will stop processing it. These criteria ensure that sequences not meeting basic requirements are filtered out early.
  • GC Knockout (%): Specify the minimum GC content percentage. If the sequence's GC content falls below this value, it will be excluded from further analysis. Example: Enter 10 to exclude sequences with GC content below 10%.
  • Length Knockout (bp): Define the minimum sequence length in base pairs (bp). If the sequence is shorter than this value, it will not proceed further. Example: Enter 2000 to exclude sequences shorter than 2000 bp.
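
As a rough illustration of how the two knockout criteria behave, the following Python sketch applies them to a sequence string. The function names are hypothetical, the values 10 and 2000 are the example values from the list above rather than confirmed defaults, and computing GC content over A, C, G and T only (ignoring ambiguous bases such as N) is an assumption.

```python
def gc_content_pct(seq: str) -> float:
    """GC content in percent, counted over A/C/G/T only (assumption)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def knockout_reasons(seq: str, gc_knockout_pct: float = 10.0,
                     length_knockout_bp: int = 2000) -> list[str]:
    """Return the knockout criteria that the sequence fails.

    An empty list means the sequence proceeds; any entry means it is
    excluded, since a single met knockout condition is sufficient.
    """
    reasons = []
    if len(seq) < length_knockout_bp:
        reasons.append("length")
    if gc_content_pct(seq) < gc_knockout_pct:
        reasons.append("gc")
    return reasons


print(knockout_reasons("ACGT" * 1000))  # 4000 bp, 50% GC
print(knockout_reasons("AT" * 2000))    # 4000 bp, 0% GC
```

The first call returns an empty list, so the sequence would continue; the second returns the GC criterion, so the sequence would be excluded.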

Validation Thresholds

The validation process halts only if all thresholds indicate the absence of an OriC (origin of replication). A sequence is therefore excluded at this stage only when every one of the conditions below is met.
  • Min. GC Content Threshold (%): Set the minimum acceptable GC content percentage for validation. Sequences with GC content below this value are flagged. Example: Enter 30 to flag sequences with GC content below 30%.
  • Entropy Threshold: Define the minimum entropy value, which represents the complexity of the sequence. Lower values indicate less complexity, which might signal invalid sequences. Example: Enter 1.8 to exclude sequences with entropy values below this threshold.
  • GC Skew Threshold: Specify the maximum allowable GC skew value. GC skew measures the imbalance of guanine and cytosine nucleotides in the sequence. Example: Enter 0.05 to include only sequences with a GC skew below this value.
  • Variance Threshold: Enter the maximum allowable variance value for sequence validation. Variance helps identify significant fluctuations in the data. Example: Enter 1,000,000 to exclude sequences with variance exceeding this threshold.
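
The "all thresholds must agree" rule can be sketched in Python as follows. This is an illustration under stated assumptions, not ForiC's implementation: entropy is taken to be the Shannon entropy of the nucleotide composition in bits (maximum 2.0 for four bases), GC skew is taken to be (G - C) / (G + C) over the whole sequence and compared by absolute value, and the variance is passed in as a precomputed number because this page does not state which quantity it is computed from. All names are hypothetical, and the default values are the example values from the list above.

```python
import math
from collections import Counter


def shannon_entropy_bits(seq: str) -> float:
    """Shannon entropy of the A/C/G/T composition, in bits (assumption)."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gc_skew(seq: str) -> float:
    """Whole-sequence GC skew, (G - C) / (G + C) (assumption)."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def validation_halts(gc_pct: float, entropy: float, skew: float,
                     variance: float, min_gc_pct: float = 30.0,
                     min_entropy: float = 1.8, max_skew: float = 0.05,
                     max_variance: float = 1_000_000):
    """Return (halt, flags); halt is True only if every threshold is flagged."""
    flags = {
        "gc": gc_pct < min_gc_pct,
        "entropy": entropy < min_entropy,
        "skew": abs(skew) > max_skew,       # absolute value is an assumption
        "variance": variance > max_variance,
    }
    return all(flags.values()), flags


# Hypothetical metric values: entropy passes, so validation does not halt.
halt, flags = validation_halts(gc_pct=25.0, entropy=1.9, skew=0.2,
                               variance=2_000_000)
print(halt, flags)
```

In this example three of the four thresholds are flagged, but because the entropy value is above its threshold, the sequence is not excluded.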