Make reference FASTA from UCSC


Step    Annotation
Step 1: Input dataset
select at runtime
This is a FASTA sequence file from UCSC. You can filter on 'type' 'o' if you just want the proper chromosomes.
Step 2: Input dataset
select at runtime
Table from UCSC. Use the output format 'selected fields from primary and related tables' and check 'chrom'.
Step 3: FASTA-to-Tabular
Output dataset 'output' from step 1
1
0
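For readers working outside Galaxy, the FASTA-to-Tabular conversion can be sketched roughly as below. This is a minimal illustration, not the Galaxy tool's code: it assumes the two parameters above mean "one title column" and "keep the whole title", and the function name and example records are hypothetical.

```python
def fasta_to_rows(lines):
    """Yield (title, sequence) pairs from an iterable of FASTA lines."""
    title, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []  # drop the leading '>'
        elif title is not None:
            chunks.append(line)  # sequence may span several lines
    if title is not None:
        yield title, "".join(chunks)


# Hypothetical FASTA records standing in for the UCSC download.
example = [
    ">hg19_chr1 range=chr1:1-8",
    "ACGT",
    "ACGT",
    ">hg19_chr2 range=chr2:1-4",
    "GGCC",
]

rows = list(fasta_to_rows(example))
for title, seq in rows:
    print(f"{title}\t{seq}")
```

Each record becomes one tab-separated line, which is what lets the later steps treat titles and sequences as ordinary table columns.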
Step 4: Remove beginning
Not available.
Output dataset 'output' from step 2
Removes the first 1 or 2 lines of the table, as they are not needed.
Step 5: Paste
Output dataset 'out_file1' from step 4
Output dataset 'output' from step 3
Tab
Adds the new chromosome column.
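Steps 4 and 5 together amount to dropping the table's header line(s) and then gluing the chromosome names onto the tabular FASTA rows, line by line. A rough sketch follows; the function names and sample data are hypothetical, and raising an error on unequal line counts is an assumption of this sketch rather than documented behaviour of the Paste tool.

```python
def remove_beginning(lines, n):
    """Drop the first n lines (Step 4)."""
    return lines[n:]


def paste(left, right, sep="\t"):
    """Join two line lists side by side (Step 5).

    Raising on unequal lengths is an assumption of this sketch, added so
    a misaligned chromosome column is caught early.
    """
    if len(left) != len(right):
        raise ValueError(f"line counts differ: {len(left)} vs {len(right)}")
    return [f"{a}{sep}{b}" for a, b in zip(left, right)]


# Hypothetical inputs: a UCSC table with one header line, and the
# two-column output of FASTA-to-Tabular.
table = ["#chrom", "chr1", "chr2"]
fasta_rows = [
    "hg19_chr1 range=chr1:1-8\tACGTACGT",
    "hg19_chr2 range=chr2:1-4\tGGCC",
]

pasted = paste(remove_beginning(table, 1), fasta_rows)
for line in pasted:
    print(line)
```

The pasting only works if both inputs list the chromosomes in the same order, which is why the workflow includes a separate check that the new column lines up.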
Step 6: Cut
c1,c3
Tab
Output dataset 'out_file1' from step 5
Cuts out the old description column.
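The Cut step keeps selected columns of the pasted table. A minimal sketch, assuming 1-based column numbers as in 'c1,c3'; the helper name and sample rows are hypothetical:

```python
def cut_columns(lines, columns, sep="\t"):
    """Keep only the given 1-based columns of each delimited line."""
    result = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        result.append(sep.join(fields[c - 1] for c in columns))
    return result


# Hypothetical pasted rows: new chromosome name, old description, sequence.
pasted = [
    "chr1\thg19_chr1 range=chr1:1-8\tACGTACGT",
    "chr2\thg19_chr2 range=chr2:1-4\tGGCC",
]

reference_rows = cut_columns(pasted, [1, 3])  # name + sequence
check_rows = cut_columns(pasted, [1, 2])      # name + old description
print(reference_rows)
print(check_rows)
```

Selecting columns 1 and 3 gives the rows that go on to become the reference; selecting columns 1 and 2 gives a small table for eyeballing whether names and descriptions agree.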
Step 7: Cut
c1,c2
Tab
Output dataset 'out_file1' from step 5
Just to check that the new column lines up with the old descriptions.
Step 8: Sort
Output dataset 'out_file1' from step 6
Not available.
Alphabetical sort
Ascending order
Column selections
0

No value found for 'Number of header lines to skip'. Using default: '0'.

Sorts on column 1 (the new chromosome column).
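An alphabetical, ascending sort on column 1 behaves like a plain string sort, so 'chr10' comes before 'chr2'. The sketch below illustrates this with hypothetical rows; it is not the Galaxy Sort tool itself.

```python
# Hypothetical two-column rows: chromosome name, sequence.
rows = [
    "chr2\tGGCC",
    "chr10\tTTAA",
    "chrX\tACGT",
    "chr1\tACGTACGT",
    "chrM\tGATC",
]

# Sort on the first tab-separated field, compared as text.
sorted_rows = sorted(rows, key=lambda line: line.split("\t")[0])

for line in sorted_rows:
    print(line.split("\t")[0])
```

If downstream tools expect a different chromosome order, that order would need to be produced separately; this sketch only mirrors the alphabetical setting shown above.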
Step 9: Tabular-to-FASTA
Output dataset 'out_file1' from step 8
1
2
Step 10: FASTA Width
Output dataset 'output' from step 9
60
This is the reference file.
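Steps 9 and 10 turn the sorted table back into FASTA and wrap each sequence at 60 characters per line. A rough equivalent, assuming column 1 holds the title and column 2 the sequence; the function name and sample data are hypothetical:

```python
def rows_to_fasta(rows, width=60):
    """Convert (title, sequence) pairs to FASTA lines wrapped at `width`."""
    lines = []
    for title, seq in rows:
        lines.append(f">{title}")
        # Slice the sequence into fixed-width chunks; the last may be shorter.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return lines


# Hypothetical rows; the first sequence is 130 bases long.
rows = [("chr1", "ACGT" * 32 + "AC"), ("chr2", "GGCC")]

fasta_lines = rows_to_fasta(rows)
for line in fasta_lines:
    print(line)
```

Wrapping changes only the line breaks; joining the wrapped lines of a record gives back the original sequence.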
Step 11: Compute sequence length
Output dataset 'output' from step 10
0
This is just to check what you have in the reference file.
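The same check can be approximated outside Galaxy by summing the sequence characters under each header. A minimal sketch; the function name and sample lines are hypothetical, and it assumes every title is unique.

```python
def sequence_lengths(fasta_lines):
    """Return {title: sequence length} for an iterable of FASTA lines.

    Assumes titles are unique; a repeated title would restart its count.
    """
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            lengths[name] = 0
        elif name is not None and line:
            lengths[name] += len(line)
    return lengths


# Hypothetical reference file contents.
reference = [">chr1", "ACGTACGT", "ACGT", ">chr2", "GGCC"]

lengths = sequence_lengths(reference)
for name, length in lengths.items():
    print(f"{name}\t{length}")
```

Comparing these lengths against the chromosome sizes published for the assembly is a quick way to confirm that names and sequences were paired correctly.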