Setting up the JSON project file
Note
You can find a ready-to-use JSON file named elongator.json in the tutorial directory and skip directly to the next step.
Create Xlink Analyzer project
The following steps describe how to create the JSON file from scratch.
First, create an XlinkAnalyzer project file for your complex.
Note
XlinkAnalyzer is used here as a graphical interface for input preparation in Assembline.
It does not matter if you have no crosslinks: XlinkAnalyzer is used here only to prepare the input file for modeling.
Open Xlink Analyzer window:
In the Xlink Analyzer project Setup tab, define subunits using the menu on the left. For each subunit, enter the name and the chain ID (or multiple comma-separated IDs), choose the color, and click the Add button.
Click on the Domains button and define domains of Elp1 in the window that opens - they will be used later for adding restraints:
Load sequence data using the panel on the right in the Setup tab.
For this, prepare a single file in FASTA format containing the sequences of all proteins.
Here, use the elp_sequences.fasta file provided in the tutorial materials. Upload the file using the Browse button, enter a name (e.g. “sequences”), and select the “sequence” type in the drop-down menu.
Click the Add button.
Map the sequence names to names of subunits by clicking the Map button and selecting the subunits in a window that opens. After the mapping, click on the “check” button that turns red.
The result should look like this:
Load crosslink data using the same panel on the right in the Setup tab.
The crosslink files need to be provided in Xlink Analyzer or xQuest format.
Here, the files in xQuest format have been prepared in the tutorial materials:
xlinks/
    DSS/
        inter_run3_190412.clean_strict.csv
        intra_run3_190412.clean_strict.csv
        loop_run3_190412.clean_strict.csv
        mono_run3_190412.clean_strict.csv
        sg1-inter.clean_strict.csv
        sg1-intra.clean_strict.csv
        sg1-loop.clean_strict.csv
        sg2-3-inter.clean_strict.csv
        sg2-3-intra.clean_strict.csv
        sg2-3-loop.clean_strict.csv
        sg2-3-mono.clean_strict.csv
    DSG/
        inter_dsg.clean_strict.csv
        inter_dsg_repeat.clean_strict.csv
        intra_dsg.clean_strict.csv
        intra_dsg_repeat.clean_strict.csv
        loop_dsg.clean_strict.csv
        loop_dsg_repeat.clean_strict.csv
        mono_dsg.clean_strict.csv
        mono_dsg_repeat.clean_strict.csv
The files contain crosslinking results for two crosslinkers (DSS and DSG), split across multiple files (multiple runs, with inter/intra/loop/mono crosslinks in separate files).
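Before loading a dataset, it can be useful to check which crosslink types it covers. The sketch below is only an illustration: it classifies the DSS file names listed above by the inter/intra/loop/mono token in each name. The helper name `crosslink_kind` is hypothetical and not part of XlinkAnalyzer or Assembline.

```python
import re
from collections import Counter

# File names copied from the xlinks/DSS directory of the tutorial materials.
DSS_FILES = [
    "inter_run3_190412.clean_strict.csv",
    "intra_run3_190412.clean_strict.csv",
    "loop_run3_190412.clean_strict.csv",
    "mono_run3_190412.clean_strict.csv",
    "sg1-inter.clean_strict.csv",
    "sg1-intra.clean_strict.csv",
    "sg1-loop.clean_strict.csv",
    "sg2-3-inter.clean_strict.csv",
    "sg2-3-intra.clean_strict.csv",
    "sg2-3-loop.clean_strict.csv",
    "sg2-3-mono.clean_strict.csv",
]


def crosslink_kind(filename):
    """Return the crosslink type token found in a file name, or None."""
    match = re.search(r"(inter|intra|loop|mono)", filename)
    return match.group(1) if match else None


# Count how many files of each type the dataset contains.
counts = Counter(crosslink_kind(name) for name in DSS_FILES)
for kind in ("inter", "intra", "loop", "mono"):
    print(kind, counts[kind])
```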
Name the first dataset “DSS”, click the Browse button, and select all CSV files from the DSS directory. Set the type to xquest or XlinkAnalyzer. Repeat for DSG.
Map the crosslinked protein names to the subunit names using the Map button.
After the operations above, you should end up with something like this:
Save the JSON file under a name like xla_project.json and make a copy that you will modify for modeling:
cp xla_project.json elongator.json
Add modeling information to the project file
Open elongator.json in a text editor.
Note
The project file is in JSON format.
While it may look difficult to edit at first, it is actually quite manageable with a proper editor (and a bit of practice ;-)
We recommend using a good editor with JSON syntax highlighting.
At this point, the JSON has the following format:
{
    "data": [
        { "some xlink definition 1" },
        { "some xlink definition 2" },
        { "sequence file definition" }
    ],
    "subunits": [ "subunit definitions" ],
    "xlinkanalyzerVersion": "..."
}
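Before editing by hand, it can help to confirm that the file parses and to see which top-level keys it holds. The following is a minimal sketch using Python's standard json module. The inline text is a hypothetical, heavily simplified stand-in for elongator.json: only the three top-level keys come from the layout above, and the entries inside are placeholders rather than actual XlinkAnalyzer output.

```python
import json

# Hypothetical stand-in for the contents of elongator.json.
# Real entries carry more fields; only the top-level layout mirrors the tutorial.
project_text = """
{
    "data": [
        {"name": "DSS"},
        {"name": "sequences"}
    ],
    "subunits": [
        {"name": "Elp1"}
    ],
    "xlinkanalyzerVersion": "..."
}
"""


def summarize(text):
    """Parse JSON text; return its top-level keys and the names of data entries."""
    project = json.loads(text)  # raises json.JSONDecodeError on a syntax error
    keys = sorted(project)
    data_names = [entry.get("name") for entry in project.get("data", [])]
    return keys, data_names


keys, data_names = summarize(project_text)
print(keys)
print(data_names)
```

For a real file, the text would be read from disk instead, for example with `open("elongator.json").read()`.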
Add symmetry
First, specify the series of symmetry-related molecules. Here, each of the three subunits is present in two symmetrical copies, so we add one series per subunit as below:
{
    "series": [
        {
            "name": "2fold",
            "subunit": "Elp1",
            "mode": "input",
            "cell_count": 2,
            "tr3d": "2fold",
            "inipos": "input"
        },
        {
            "name": "2fold",
            "subunit": "Elp2",
            "mode": "auto",
            "cell_count": 2,
            "tr3d": "2fold",
            "inipos": "input"
        },
        {
            "name": "2fold",
            "subunit": "Elp3",
            "mode": "auto",
            "cell_count": 2,
            "tr3d": "2fold",
            "inipos": "input"
        }
    ],
    "data": [
        { "some xlink definition 1" },
        { "some xlink definition 2" },
        { "sequence file definition" }
    ],
    "subunits": [ "subunit definitions" ],
    "xlinkanalyzerVersion": "..."
}
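Hand-editing nested JSON makes it easy to drop a comma between blocks. As an alternative, the series can be added with a short script and written back with json.dumps, which always emits valid syntax. This is a sketch only: the function `add_series` is hypothetical, the field values are copied from the series definition above, and the starting project is an empty placeholder. The new key is appended at the end of the object, which makes no difference to a standard JSON parser.

```python
import json


def add_series(project, subunit_modes, name="2fold", cell_count=2):
    """Return a copy of the project dict with one series entry per subunit."""
    series = [
        {
            "name": name,
            "subunit": subunit,
            "mode": mode,
            "cell_count": cell_count,
            "tr3d": name,
            "inipos": "input",
        }
        for subunit, mode in subunit_modes.items()
    ]
    updated = dict(project)  # shallow copy, the input dict is left untouched
    updated["series"] = series
    return updated


# Placeholder project; a real one would come from json.load on elongator.json.
project = {"data": [], "subunits": [], "xlinkanalyzerVersion": "..."}
modes = {"Elp1": "input", "Elp2": "auto", "Elp3": "auto"}

updated = add_series(project, modes)
text = json.dumps(updated, indent=4)
print(text)
```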
Second, define the coordinates of the symmetry axis:
{
    "symmetry": {
        "sym_tr3ds": [
            {
                "name": "2fold",
                "axis": [0, 0, -1],
                "center": [246.39112398, 246.41114644, 248.600000],
                "type": "C2"
            }
        ]
    },
    "series": [ "the series" ],
    "data": [
        { "some xlink definition 1" },
        { "some xlink definition 2" },
        { "sequence file definition" }
    ],
    "subunits": [ "subunit definitions" ],
    "xlinkanalyzerVersion": "..."
}
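To see what this axis definition implies geometrically, the sketch below rotates a point by 180° about the axis through the center using Rodrigues' rotation formula. It assumes that C2 denotes a twofold rotation about the given axis, which is the standard meaning of the point group; how Assembline applies the transformation internally is not shown here. The function name and the test point are hypothetical.

```python
import math


def rotate_about_axis(point, axis, center, angle_deg):
    """Rotate a 3D point about an axis passing through center (Rodrigues' formula)."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]  # unit vector along the axis
    v = [p - c for p, c in zip(point, center)]  # point relative to the center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(a * b for a, b in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1 - cos_t)
        for i in range(3)
    ]
    return [r + c for r, c in zip(rotated, center)]


# Axis and center as defined in the symmetry block above.
axis = [0, 0, -1]
center = [246.39112398, 246.41114644, 248.6]

# Hypothetical atom position, 10 units from the axis along x.
point = [256.39112398, 246.41114644, 300.0]

mate = rotate_about_axis(point, axis, center, 180.0)
print([round(x, 3) for x in mate])
```

Because the axis is parallel to z, the twofold rotation reflects the x and y coordinates through the center and leaves z unchanged; applying it twice returns the original point.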
Add specification of input PDB files
The input structures for the tutorial are in the in_pdbs/ directory:
Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb
Elp1_NTD_1st_propeller.pdb
Elp1_NTD_2nd_propeller.pdb
Elp2.pdb
Add them to the JSON like this:
{
    "symmetry": { "symmetry axis definition" },
    "series": [ "the series" ],
    "data": [
        {
            "type": "pdb_files",
            "name": "pdb_files",
            "data": [
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        {
                            "name": "Elp1", "subunit": "Elp1", "domain": "propeller1",
                            "filename": "in_pdbs/Elp1_NTD_1st_propeller.pdb"
                        }
                    ]
                },
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        {
                            "name": "Elp1", "subunit": "Elp1", "domain": "propeller2",
                            "filename": "in_pdbs/Elp1_NTD_2nd_propeller.pdb"
                        }
                    ]
                },
                {
                    "components": [
                        {
                            "name": "Elp1", "subunit": "Elp1", "serie": "2fold", "copies": [0], "chain_id": "G", "domain": "CTD",
                            "filename": "in_pdbs/Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb"
                        },
                        {
                            "name": "Elp1", "subunit": "Elp1", "serie": "2fold", "copies": [1], "chain_id": "H", "domain": "CTD",
                            "filename": "in_pdbs/Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb"
                        }
                    ]
                },
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        { "name": "Elp2", "subunit": "Elp2", "filename": "in_pdbs/Elp2.pdb" }
                    ]
                },
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        { "name": "Elp3", "subunit": "Elp3", "filename": "in_pdbs/Elp3.mono.pdb" }
                    ]
                }
            ]
        },
        { "some xlink definition 1" },
        { "some xlink definition 2" },
        { "sequence file definition" }
    ],
    "subunits": [ "subunit definitions" ],
    "xlinkanalyzerVersion": "..."
}
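A mistyped path in the pdb_files block is easy to miss in a file this size. The sketch below walks the structure shown above, collects every referenced PDB filename, and reports those that do not exist on disk. Both helper functions are hypothetical and independent of Assembline, and the project dict is a cut-down excerpt of the JSON above written as a Python literal.

```python
from pathlib import Path


def referenced_pdb_files(project):
    """Collect the unique PDB filenames referenced in all pdb_files data entries."""
    filenames = []
    for entry in project.get("data", []):
        if not isinstance(entry, dict) or entry.get("type") != "pdb_files":
            continue  # skip crosslink and sequence entries
        for selection in entry.get("data", []):
            for component in selection.get("components", []):
                name = component.get("filename")
                if name and name not in filenames:
                    filenames.append(name)
    return filenames


def missing_files(filenames, root="."):
    """Return the referenced files that do not exist below root."""
    return [name for name in filenames if not (Path(root) / name).is_file()]


# Cut-down excerpt of the pdb_files entry from the tutorial.
ctd = "in_pdbs/Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb"
project = {
    "data": [
        {
            "type": "pdb_files",
            "name": "pdb_files",
            "data": [
                {
                    "foreach_serie": True,
                    "foreach_copy": True,
                    "components": [
                        {"name": "Elp2", "subunit": "Elp2", "filename": "in_pdbs/Elp2.pdb"}
                    ],
                },
                {
                    "components": [
                        {"name": "Elp1", "subunit": "Elp1", "serie": "2fold",
                         "copies": [0], "chain_id": "G", "domain": "CTD", "filename": ctd},
                        {"name": "Elp1", "subunit": "Elp1", "serie": "2fold",
                         "copies": [1], "chain_id": "H", "domain": "CTD", "filename": ctd},
                    ],
                },
            ],
        }
    ]
}

pdb_files = referenced_pdb_files(project)
print(pdb_files)
print("missing:", missing_files(pdb_files))
```

Run from the tutorial directory on the full project (loaded with json.load), an empty "missing" list would indicate that all referenced structures are in place.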
The foreach_serie and foreach_copy options indicate that the given PDB file specification is applied to each series containing this subunit and to each copy within that series.
All PDB selections within the same components block are grouped into a rigid body, unless a separate rigid_bodies block is specified and add_rbs_from_pdbs is set to False in the parameter file (see Setting up the parameter file).
Add pointers to fit libraries
{
    "symmetry": { "symmetry axis definition" },
    "series": [ "the series" ],
    "data": [
        {
            "type": "pdb_files",
            "name": "pdb_files",
            "data": [
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        {
                            "name": "Elp1", "subunit": "Elp1", "domain": "propeller1",
                            "filename": "in_pdbs/Elp1_NTD_1st_propeller.pdb"
                        }
                    ],
                    "positions": "fits/search100000_metric_cam_inside0.6/emd_4151_binned.mrc/Elp1_NTD_1st_propeller.pdb/solutions_pvalues.csv",
                    "positions_type": "chimera",
                    "max_positions": 10000
                },
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        {
                            "name": "Elp1", "subunit": "Elp1", "domain": "propeller2",
                            "filename": "in_pdbs/Elp1_NTD_2nd_propeller.pdb"
                        }
                    ],
                    "positions": "fits/search100000_metric_cam_inside0.6/emd_4151_binned.mrc/Elp1_NTD_2nd_propeller.pdb/solutions_pvalues.csv",
                    "positions_type": "chimera",
                    "max_positions": 10000
                },
                {
                    "components": [
                        {
                            "name": "Elp1", "subunit": "Elp1", "serie": "2fold", "copies": [0], "chain_id": "G", "domain": "CTD",
                            "filename": "in_pdbs/Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb"
                        },
                        {
                            "name": "Elp1", "subunit": "Elp1", "serie": "2fold", "copies": [1], "chain_id": "H", "domain": "CTD",
                            "filename": "in_pdbs/Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb"
                        }
                    ],
                    "positions": "fits/search100000_metric_cam_inside0.6/emd_4151_binned.mrc/Elp1.CTD.on5cqs.5cqr.model_ElNemo_mode7.pdb/solutions_pvalues.csv",
                    "positions_type": "chimera",
                    "max_positions": 1
                },
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        { "name": "Elp2", "subunit": "Elp2", "filename": "in_pdbs/Elp2.pdb" }
                    ],
                    "positions": "fits/search100000_metric_cam_inside0.6/emd_4151_binned.mrc/Elp2.pdb/solutions_pvalues.csv",
                    "positions_type": "chimera",
                    "max_positions": 10000
                },
                {
                    "foreach_serie": true,
                    "foreach_copy": true,
                    "components": [
                        { "name": "Elp3", "subunit": "Elp3", "filename": "in_pdbs/Elp3.mono.pdb" }
                    ],
                    "positions": "fits/search100000_metric_cam_inside0.6/emd_4151_binned.mrc/Elp3.mono.pdb/solutions_pvalues.csv",
                    "positions_type": "chimera",
                    "max_positions": 10000
                }
            ]
        },
        { "some xlink definition 1" },
        { "some xlink definition 2" },
        { "sequence file definition" }
    ],
    "subunits": [ "subunit definitions" ],
    "xlinkanalyzerVersion": "..."
}
And that’s it!
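The positions paths in the fit-library pointers all follow one pattern: the fit directory, then the PDB file name, then solutions_pvalues.csv. As a rough sketch, the pointers could be generated from the PDB filename instead of being typed by hand. The helpers `positions_path` and `add_positions` are hypothetical, and the directory layout is assumed only from the paths that appear in this tutorial.

```python
from pathlib import PurePosixPath

# Fit directory as it appears in the tutorial's positions paths.
FIT_DIR = "fits/search100000_metric_cam_inside0.6/emd_4151_binned.mrc"


def positions_path(pdb_filename, fit_dir=FIT_DIR):
    """Build the solutions_pvalues.csv path for a given input PDB file."""
    pdb_name = PurePosixPath(pdb_filename).name  # drop the in_pdbs/ prefix
    return str(PurePosixPath(fit_dir) / pdb_name / "solutions_pvalues.csv")


def add_positions(selection, max_positions=10000, fit_dir=FIT_DIR):
    """Return a copy of a pdb_files selection with fit-library pointers added."""
    first_pdb = selection["components"][0]["filename"]
    updated = dict(selection)  # shallow copy, the input is left untouched
    updated["positions"] = positions_path(first_pdb, fit_dir)
    updated["positions_type"] = "chimera"
    updated["max_positions"] = max_positions
    return updated


selection = {
    "foreach_serie": True,
    "foreach_copy": True,
    "components": [
        {"name": "Elp2", "subunit": "Elp2", "filename": "in_pdbs/Elp2.pdb"}
    ],
}

with_fits = add_positions(selection)
print(with_fits["positions"])
```

For the Elp1 CTD selection, which uses a single position in the tutorial, the call would be `add_positions(selection, max_positions=1)`.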