====================
What's New (History)
====================

Version 1.5.0
-------------

*Released on September 14 2022*

The two big changes are:

1. the ability to use `Yaml `__ files to specify samples,
2. the introduction of ``run_for_all`` (and ``run_for_all_samples``)
   functions to simplify the usage of the ``parallel`` module (see `standard library docs `__).

Several of the other changes were made to support these two features.
Additionally, some minor fixes and improvements were made.

User-visible Improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Add ``load_sample_list`` function to load samples in YAML format (see `YAML Samples `_).
- Add ``compress_level`` argument to ``write`` function to specify the
  compression level.
- Added ``name()`` method to ``ReadSet`` objects, so you can do::

    input = load_fastq_directory("my-sample")
    print(input.name())

  which will print ``my-sample``.
- Added ``println`` function which works like ``print`` but prints a newline
  after the output.
- Make ``print()`` accept ints and doubles as well as strings.
- Added ``run_for_all`` function to ``parallel`` module, simplifying its `API `__.
- When using the ``parallel`` module, if a job fails, the log is now written
  to the corresponding ``.failed`` file.
- External modules can now use the ``sequenceset`` type to represent a FASTA
  file.
- The ``load_fastq_directory`` function now supports ``.xz`` compressed files.
- The ``parallel`` module now checks for stale locks **before** re-trying
  failed tasks. The former model could lead to a situation where a particular
  sample failed deterministically and then blocked progress even when some
  locks were stale.

Bugfixes
~~~~~~~~

- The ``parallel`` module should generate a ``.failed`` file for each failed
  job, but this was not happening in every case.
- Fixed parsing of GFF files to support negative values (reported by Josh Sekela on the `mailing-list `__).

Version 1.4.2
-------------

Released *21 July 2022*

Bugfixes
~~~~~~~~

- Fix bug with parsing GFF files (it was assumed that *scores* were always
  positive)

Version 1.4.1
-------------

Released *3 June 2022*

Bugfixes
~~~~~~~~

- Fix bug with *low memory mode*

Version 1.4.0
-------------

Released *30 May 2022*

User-visible Improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- ``write()`` now returns the filename used
- ``write()`` can use multiple threads
- Better error messages in multiple situations
- Add a module for `GMGC — Global Microbial Gene Catalogue `__
- Old ``motus`` (version 1) module deprecated

Bugfixes
~~~~~~~~

- Update `--install-reference-data` mode to newer URLs, see `#107 `__
- Update `--create-reference-pack` mode to newer format (where indices are versioned), see `#108 `__
- Do not fail when merging empty files (`#113 `__)

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- Better building infrastructure
- Switched to the tasty testing framework
- ``assemble()`` now uses a more up-to-date version of megahit, which means
  that the older versions cannot be run.

Version 1.3.0
-------------

Released *28 January 2021*

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Adds conversion from string to numbers (int or double) and back
- Better error message if the user attempts to use the non-existent ``<\>``
  operator (suggest ``</>``)
- Validate ``count()`` headers on ``--validate-only``

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- Switched internal interval structure to `interval-int `__.
  For users using GFF-style annotation in ``count()``, this should result in
  a significant improvement (less memory, faster performance)
- Use zstd compression for more temporary files

Bugfixes
~~~~~~~~

- Fix cases where sample names contain ``/`` and ``collect()`` (`issue 141 `__)

Version 1.2.0
-------------

Released *12 July 2020*.

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Added function `load_fastq_directory `__ to the builtin namespace.
  This was previously available under the ``mocat`` module, but it had
  become much more flexible than the original MOCAT version, so it was no
  longer a descriptive name.
- Better messages in `parallel `__ module when there are no free locks.

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- Modules can now specify their annotation as a URL that NGLess downloads on
  an "as needed" basis: in version 1.1, only FASTA files were supported.
- Memory consumption of `count() function `__ has been improved when using GFF files (*ca.* ⅓ less memory used).
- This one is hopefully **not** user-visible: Previously, NGLess would ship
  the JavaScript libraries it uses for the HTML viewer and copy them into all
  its outputs. Starting in v1.2.0, the HTML viewer links to the live versions
  online.

Version 1.1.1
-------------

This is a bugfix release and results should not change. In particular, a
sequence reinjection bug was fixed.

Version 1.1.0
-------------

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Added `discard_singles() function `__.
- Added ``include_fragments`` option to `orf_find() `__.
- The `countfile `__ now reorders its input if it is not ordered. This is
  necessary for correct usage.
- More flexible loading of ``functional_map`` arguments in `count `__ to accept multiple comment lines at the top of the file as produced by `eggnog-mapper `__.
- Added ``sense`` argument to the `count `__ function, generalizing the
  previous ``strand`` argument (which is deprecated). Whereas before it was
  only possible to consider features either to be present on both strands or
  only on the strand to which they are annotated, now it is also possible to
  consider them present only on the opposite strand (which is necessary for
  some strand-specific protocols as they produce the opposite strand).
- Added ``interleaved`` argument to `fastq `__
- ``load_mocat_sample`` now checks for mismatched paired samples (`#120 `__)
- Better messages when collect call could not finish (following discussion on the `mailing list `__)
- Modules can now specify their resources as a URL that NGLess downloads on
  an "as needed" basis.
- `len `__ now works on lists

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- ZSTD compression is available for output, and intermediate files use it to
  reduce temporary space usage (and possibly speed up processing).
- Faster check for column headers in ``functional_map`` argument to `count() `__ function:
  now it is performed *as soon as possible* (including at the top of the
  script if the arguments are literal strings), thus NGLess can fail faster.

Version 1.0.1
-------------

This is a bugfix release and results should not change.

Bugfixes
~~~~~~~~

- Fix bug with external modules and multiple fastQ inputs.
- Fix bug with resaving input files where the original file was sometimes
  moved (thus removing it).
- When ``bwa`` or ``samtools`` calls fail, show the user the stdout/stderr from these processes (see `#121 `__).

Version 1.0
-----------

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- The handling of multiple annotations in `count `__ (i.e., when the user
  requests multiple ``features`` and/or ``subfeatures``) has changed. The previous model caused a few issues (`#63 `__, but also mixing with `collect() `__).
  Unfortunately, this means that scripts asking for the old behaviour in
  their version declaration are no longer supported if they use multiple
  features.

Version 0.11
------------

Released March 15 2019 (**0.11.0**) and March 21 2019 (**0.11.1**).

Version 0.11.0 used Zstandard compression, which was not reliable (the
official Haskell zstd wrapper has issues). Thus, it was removed in v0.11.1.
Using v0.11.0 is **not recommended**.

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Module samtools (version 0.1) now includes `samtools_view`
- Add `--verbose` flag to check-install mode (`ngless --check-install --verbose`)
- Add early checks for input files in more situations (`#33 `__)
- Support compression in `collect()` output (`#42 `__)
- Add `smoothtrim() `__ function

Bugfixes
~~~~~~~~

- Fix bug with `orf_find` & `prots_out` argument
- Fix bug in garbage collection where intermediate files were often left on
  disk for far longer than necessary.
- Fix CIGAR (`#92 `__) for select() blocks

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- Switched to diagrams package for plotting. This should make building easier
  as cairo was often a complicated dependency.
- Update to LTS-13 (GHC 8.6)
- Update minimap2 version to 2.14
- Call bwa/minimap2 with interleaved fastq files. This avoids calling it
  twice (which would mean that the indices were read twice).
- Avoid leaving open file descriptors after FastQ encoding detection
- Tar extraction uses much less memory now (`#77 `__)

Version 0.10.0
--------------

Released Nov 12 2018

Bugfixes
~~~~~~~~

- Fixed bug where header was printed even when STDOUT was used
- Fix to lock1's return value when used with paths (`#68 - reopen `__)
- Fixed bug where writing interleaved FastQ to STDOUT did not work as expected
- Fix saving fastq sets with --subsample (issue `#85 `__)
- Fix (hypothetical) case where the two mate files have different FastQ
  encodings

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- samtools_sort() now accepts by={name} to sort by read name
- Add __extra_megahit_args to assemble() (`issue #86 `__)
- arg1 in external modules is no longer always treated as a path
- Added expand_searchdir to external modules API (`issue #56 `__)
- Support _F/_R suffixes for forward/reverse in load_mocat_sample
- Better error messages when version is mis-specified
- Support `NO_COLOR `__ standard: when ``NO_COLOR`` is present in the
  environment, print no colours.
- Always check output file writability (`issue #91 `__)
- ``paired()`` now accepts ``encoding`` argument (it was documented to, but
  mis-implemented)

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- NGLess now pre-emptively garbage collects files when they are no longer needed (`issue #79 `__)

Version 0.9.1
-------------

Released July 17th 2018

- Added `NGLess preprint citation `__

Version 0.9
-----------

Released July 12th 2018

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Added ``allbest()`` method to MappedRead.
- NGLess will issue a warning before overwriting an existing file.
- Output directory contains PNG files with basic QC stats
- Added modules for gut gene catalogs of `mouse `__, `pig `__, and `dog `__
- Updated the `integrated gene catalog `__

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- All lock files now are continuously "touched" (i.e., their modification
  time is updated every 10 minutes).
  This makes it easier to discover stale lock files.
- The automated downloading of builtin references now uses versioned URLs, so
  that, in the future, we can change them without breaking backwards
  compatibility.

Version 0.8.1
-------------

Released June 5th 2018

This is a minor release and upgrading is recommended.

Bugfixes
~~~~~~~~

- Fix for systems with non-working locale installations
- Much faster `collect `__ calls
- Fixed `lock1 `__ when used with full paths (see `issue #68 `__)
- Fix expansion of searchpath with external modules (see `issue #56 `__)

Version 0.8
-----------

Released May 6th 2018

Incompatible changes
~~~~~~~~~~~~~~~~~~~~

- Added an extra field to the FastQ statistics, with the fraction of
  basepairs that are not ATCG. This means that uses of `qcstats `__ must use
  an up-to-date version declaration.
- In certain cases (see below), the output of count when using a GFF will
  change.

User-visible improvements
~~~~~~~~~~~~~~~~~~~~~~~~~

- Better handling of multiple features in a GFF. For example, using a GFF
  containing "gene_name=nameA,nameB" would result in::

    nameA,nameB    1

  Now the same results in::

    nameA    1
    nameB    1

  This follows after `https://git.io/vpagq `__ and the case of
  *Parent=AF2312,AB2812,abc-3*
- Support for `minimap2 `__ as alternative mapper. Import the ``minimap2``
  module and specify the ``mapper`` when calling `map `__. For example::

    ngless '0.8'
    import "minimap2" version "1.0"

    input = paired('sample.1.fq', 'sample.2.fq', singles='sample.singles.fq')
    mapped = map(input, fafile='ref.fna', mapper='minimap2')
    write(mapped, ofile='output.sam')

- Added the ``</>`` operator. This can be used to concatenate filepaths.
  ``p0 </> p1`` is short for ``p0 + "/" + p1`` (except that it avoids double
  forward slashes).
- Fixed a bug in `select `__ where in some edge cases, the sequence would be
  incorrectly omitted from the result. Given that this is a rare case, if a
  version prior to 0.8 is specified in the version header, the old behaviour
  is emulated.
- Added bzip2 support to `write `__.
- Added reference argument to `count `__.

Bug fixes
~~~~~~~~~

- Fix writing multiple compressed Fastq outputs.
- Fix corner case in `select `__. Previously, it was possible that some
  sequences were wrongly removed from the output.

Internal improvements
~~~~~~~~~~~~~~~~~~~~~

- Faster `collect() `__
- Faster FastQ processing
- Updated to bwa 0.7.17
- External modules now call their init functions with a lock
- Updated library collection to LTS-11.7

Version 0.7.1
-------------

Released Mar 17 2018

Improves memory usage in ``count()`` and the use of the ``when-true`` flag in
external modules.

Version 0.7
-----------

Released Mar 7 2018

New functionality in NGLess language
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Added `max_trim `__ argument to ``filter`` method of ``MappedReadSet``.
- Support saving compressed SAM files
- Support for saving interleaved FastQ files
- Compute number of basepairs in FastQ stats
- Add ``headers`` argument to `samfile function `__

Bug fixes
~~~~~~~~~

- Fix ``count``'s mode ``{intersection_strict}`` to no longer behave as
  ``{union}``
- Fix ``as_reads()`` for single-end reads
- Fix ``select()`` corner case

In addition, this release also improves both speed and memory usage.

Version 0.6
-----------

Released Nov 29 2017

Behavioural changes
~~~~~~~~~~~~~~~~~~~

- Changed ``include_m1`` default in `count() `__ function to True

New functionality in NGLess language
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Added `orf_find `__ function (implemented through Prodigal) for open
  reading frame (ORF) prediction
- Add `qcstats() `__ function to retrieve the computed QC stats.
- Added reference alias for a more human readable name
- Updated builtin references to include latest releases of assemblies

New functionality in NGLess tools
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Add --index-path functionality to define where to write indices.
- Allow `citations` as key in external modules (generally better citations
  information)
- Use multiple threads in SAM->BAM conversion
- Better error checking/script validation

Bug fixes
~~~~~~~~~

- Output preprocessed FQ statistics (had been erroneously removed)
- Fix --strict-threads command-line option spelling
- Version embedded megahit binary
- Fixed inconsistency between reference identifiers and underlying files

Version 0.5.1
-------------

Released Nov 2 2017

Fixed some build issues

Version 0.5
-----------

Released Nov 1 2017

First release supporting all basic functionality.