NEWS
ndl 0.2.17 (2015-11-18)
- Removed unused openmp imports.
- Removed duplications and inconsistancies in DESCRIPTION file and package info.
ndl 0.2.16 (2014-02-27)
- Fixed error that resulted in wrong estimates when estimateWeights was used
under some (all?) circumstances.
- Hmisc becomes dependency.
ndl 0.2.14 (2013-11-12)
- Fixed error that caused R to crash during garbage collection. This may have
been due to a mysterious interactions between Rcpp and gc, but I am not sure
what it was. Changing the type of returned object from Rcpp::List to SEXP fixed it.
- Other minor fixes and error checking.
ndl 0.2.13 (2013-11-06)
- Fixed a big bug in estimateWeights that reordered rows of the output.
- Fixed a bug in orthoCoding that was caused by global variable collisions.
- Added more error checking on input to functions.
ndl 0.2.11
- Fixed problem with print.ndlClassify.
- Fixed issue with OpenMP compilation
ndl 0.2.10
- Fixed a bug in c++ module that added extra column to output matrix.
ndl 0.2.9
- Added a better Serbian Unicode data set that uses Cyrillic instead of Latin letters, and has
the proper Unicode encoding set.
ndl 0.2.6
unreleased versions)
- Fixed bugs and improved performance.
- Prepared software for CRAN release.
- Due to certain software dependencies, this release of ndl might not
compile on Mac OS X 10.6 and earlier. If this is the case, please
continue to use ndl version 0.1.6. The results produced are
numerically almost identical, with slower performance being the main
difference
- new option for estimateWeights : hasUnicode. This must be used when rownames
contain Unicode
- other new options: addBackground and trueProbabilities (see the docs for
explanation of what they do.) They are turned off by default to maintain continuity with
ndl 0.2.1
- Fixed some bugs in estimateWeights.R relating to "addBackground".
- Changed the way that duplicate cues were counted when "RemoveDuplicates" was
set to FALSE
- Added a helper python script in inst/scripts called "ndl.preproc.py". This script
can be used to convert a text corpus into a list of learning events. See the script for
help on using it
- Duplicate outcomes in events are now removed by default.
ndl 0.2.0
- added the ability to use the new Compact Event format in the function
estimateWeightsCompact
- added a new low-rank approximation of the SVD for the pseudoinverse in the case
that there are more than 20000 cues
ndl 0.1.6 (2012-12-23)
- Removed the program cooc.c and re-implemented the event processing code
- Parallelized where possible using OpenMP
ndl 0.1.1 (2011-06-25)
- General: general modifications and fixes to several functions to
check the integrity of input (mainly argument 'cuesOutcomes'), and
otherwise with deal with diverse input formats with less hickups;
new function 'predict.ndlClassify'; user-specifiable scalability for
dealing with really large input files when using the C code method
for the calculation of Cue-Cue and Cue-Outcome co-occurrences;
maintainer e-mail address modified to a generic one.
- acts2probs.R (acts2probs): code modified internally so that a
minimum activation value of 'acts.minimum.correction=1e-10' is used
instead when any activation value is exactly equal to 0 in input
'acts' (due to lack of any previously observed cues in input to
e.g. 'estimateActivations').
- coocMatrices.c (cooc): C code fixed so that it will work with space
characters ' ' in Cues and Outcomes (as input 'cuesOutcomes' to
'estimateWeights', 'cooccurrencesCues' and/or
'cooccurrencesCuesOutcomes') - due to this fix the 'Outcomes',
'Cues' and 'Frequency' columns are now separated by single tabulator
characters '\t' (instead of spaces ' ') in the temporary text file:
'ndl_410025912.txt' through which data is passed to the C code from
the 'estimateWeights', 'cooccurrencesCues' and
'cooccurencesCuesOutcomes' functions ; C code modified so that the
sizes of Cues and Outcomes (and their combinations) can be specified
flexibly by the user in 'estimateWeights', 'cooccurrencesCues'
and/or 'cooccurrencesCuesOutcomes' beyond the default values
(max.cues=20000, max.characters=20000, max.lines=500000;
communicated via the temporary parameter file:
'ndl_par_410025912.txt'), which will allow the use of the C code
alternative (with 'method="C"') via greater memory allocation for
very large data sets on servers with (sufficient) greater capacity;
C code modified so that upon encountering errors (not finding the
parameter or data text files, or the data file exceeding
'max.lines'), which normally will all be detected by the calling R
functions but can occur if the C code is called directly, an
appropriate error message will be output to: 'ndl_err_410025912.txt'
and the execution will end gracefully with 'return' (instead of
'exit').
- cooccurrencesCues.R (cooccurrencesCues): new arguments 'max.cues',
'max.characters' and 'max.lines' added and code modified so that the
sizes of Cues and Outcomes in input 'cuesOutcomes'can be specified
by the user beyond the default values (i.e. max.cues=20000,
max.characters=20000, max.lines=500000; communicated via the
temporary parameter file: 'ndl_par_410025912.txt' to the auxiliary
C code alternative); global option "ndl.estimateWeights" indicating
call from 'estimateWeights' checked so that the auxiliary functions
'cooccurrenceCues' and 'cooccurrenceCuesOutcomes' will not rewrite
the input text file: 'ndl_410025912.txt'.
- cooccurrenceCuesOutcomes.R (cooccurrencesCuesOutcomes): new
arguments 'max.cues', 'max.characters' and 'max.lines' added and
code modified so that the sizes of Cues and Outcomes in input
'cuesOutcomes'can be specified by the user beyond the default values
(i.e. max.cues=20000, max.characters=20000, max.lines=500000;
communicated via the temporary parameter file:
'ndl_par_410025912.txt' to the auxiliary C code alternative); ;
global option "ndl.estimateWeights" indicating call from
'estimateWeights' checked so that the auxiliary functions
'cooccurrenceCues' and 'cooccurrenceCuesOutcomes' will not rewrite
the input text file: 'ndl_410025912.txt'.
- estimateActivations.R (estimateActivations): code and output values
changed so that previously unseen Cues and Outcomes in input
'cuesOutcomes' are included in the output values, in addition to
'activationMatrix', as 'newCues' and 'newOutcomes', instead of these
being printed to standard output; in the case of such unseen
Cues or Outcomes a warning message is now output; code modified so
that a warning message is output if potential accidental 'NA'
strings (resulting from 'as.character(NA)') are detected among the
input Cues and Outcomes in 'cuesOutcomes'; code will stop with an
error message if any actual NA's are encountered among Cues and
Outcomes in 'cuesOutcomes'.
- estimateWeights.R (estimateWeights): new arguments 'max.cues',
'max.characters' and 'max.lines' added and code modified so that the
sizes of Cues and Outcomes in input 'cuesOutcomes' can be specified
by the user beyond the default values (i.e. max.cues=20000,
max.characters=20000, max.lines=500000; communicated via the
temporary parameter file: 'ndl_par_410025912.txt' to the auxiliary
C code alternative); code modified so that the function will stop
with an error message if accidental empty cues (string-initial or
string-final, or multiple string-medial underscores '_'), or NA's
are detected in Cues or Outcomes in 'cuesOutcomes'; a warning
message is output if potential accidental 'NA' strings (resulting
from 'as.character(NA)') are detected among the input Cues and
Outcomes in 'cuesOutcomes'; code modified internally so that Cues
and Outcomes in 'cuesOutcomes' are coerced to character strings, in
case they might be input as factors (which may happen by default
when reading data with 'read.table' or 'read.csv' into R from an
external plain text or CSV file as a data frame); global option
"ndl.estimateWeights" specified as TRUE so that the auxiliary
functions 'cooccurrenceCues' and 'cooccurrenceCuesOutcomes' will not
rewrite the input text file: 'ndl_410025912.txt'; after successful
execution, "ndl.estimateWeights" again specified as NULL.
- ndlClassify.R (ndlClassify): code modified internally so that
character length of response (Outcome) variable in 'formula'
argument is not restricted.
- ndlClassify.R (print.ndlClassify): code modified so that setting
'max.print=NA' will print the entire 'weightMatrix' matrix.
- ndlCrossvalidate.R (ndlCrossvalidate): code modified internally so
that the checks of the 'formula' and 'frequency' arguments do not
produce superfluous warnings.
- ndlCuesOutcomes.R (ndlCuesOutcomes): code modified internally so
that character length of response (Outcome) variable in 'formula'
argument is not restricted.
- predict.ndlClassify.R (predict.ndlClassify): new function that
allows for the estimation of activation-based probabilities and/or
the prediction of Outcomes with new, unseen data, using the weights
in 'weightMatrix' in an object previously fitted with 'ndlClassify'.
- summary.ndlClassify.R (summary.ndlClassify): code modified so that
setting 'max.print=NA' will print out entire 'weights' matrix.
- Documents: Manual page contents modified to correspond to changes in
R code; references in manual pages updated.