main.dox
No OneTemporary
Actions

Size

5 KB

Subscribers

None

main.dox
View Options

	/// @mainpage
	///
	/// YODA is a package for creation and analysis of statistical data, written in
	/// C++ and usable from C++ and Python code. The development and developers of
	/// YODA have emerged from the sub-field of Monte Carlo event generator
	/// validation and tuning in high-energy physics (HEP), but very intentionally
	/// there is nothing in YODA which is specific to that application!
	///
	/// HEP researchers may reasonably ask why we would develop another data
	/// analysis package when the ROOT (http://cern.ch/root/) package is so well
	/// established in our field and there are also some less well-known
	/// alternatives like AIDA. Well, here are a few comments along those lines,
	/// along the way illustrating the design principles of YODA:
	///
	/// - ROOT is a very large and monolithic package, containing many, many
	/// features other than basic data analysis types. For our applications it was
	/// not desirable to introduce such a huge dependency just to get some
	/// histogramming functionality. <b>YODA design aim #1: be small and
	/// to-the-point, i.e. do <i>one</i> job really well.</b>
	///
	/// - ROOT and AIDA do not support sparse binning, i.e. gaps between histogram
	/// bins in either 1D or 2D histograms. This feature is required in general,
	/// and in particular had long been a problem for the Rivet and Professor
	/// systems which used an AIDA implementation for several years and required
	/// post-processing scripts to strip out the spurious bins. The only neat
	/// solution was to write our own histogram classes -- along the way making it
	/// fast despite the very general implementation. <b>YODA design aim #2: support
	/// general requirements of data objects.</b>
	///
	/// - ROOT is infamously difficult to use as a library: it likes to take over
	/// the command line parsing, manage object memory with hidden state (leading
	/// to crashes or memory leaks), etc. We didn't want to either have to deal
	/// with those problems ourselves, nor to pass them on to YODA. AIDA is also
	/// far from blameless from a programming usability point of view, having
	/// apparently been designed based on an overenthusiastic reading of a design
	/// patterns book without considering whether using factory objects for
	/// everything, or its insistence on a broken filesystem path metaphor, would
	/// have an impact on users. <b>YODA design aim #3: make the nicest, sanest,
	/// programming interface for data analysis that we possibly can.</b>
	///
	/// - In ROOT in particular, histograms as statistical data objects and
	/// histograms as graphical data representations are conflated concepts. It's
	/// hence impossible to declare some data as constant (for safety) without
	/// then being unable to change its plotting style. We don't like that. We
	/// also want to avoid common mistakes such as confusing the height of a
	/// histogram bin (in representation terms) with its statistical content --
	/// YODA is designed to make the meaning of bin contents very clear and
	/// unambiguous. <b>YODA design aim #4: separate data handling from
	/// presentation.</b> The current implementation is dominated by the
	/// statistical data handling and numerical correctness -- existing systems
	/// like the Rivet make-plots script are relied upon for publication-quality
	/// plotting.
	///
	/// - Event generators of various kinds do not always produce simulated events
	/// with a physical distribution: the events themselves may be weighted for
	/// technical or efficiency reasons. They may also produce negative weights,
	/// or a whole collection of weights with different meanings for each event:
	/// these aspects of statistical analysis are not well-served by existing
	/// systems. <b>YODA design aim #5: handle weights in the best possible way,
	/// including negative weights and weight vectors.</b>
	///
	/// - Like ROOT and AIDA, the emphasis of YODA is on reproducible, programmatic
	/// data analysis. Plots should not normally require manual intervention: if
	/// made via a script or program, it is possible to reproduce plots after a
	/// long delay or when the original author has moved on... that's good for
	/// science. Unlike ROOT, we don't think that C++ is a sane scripting
	/// language. <b>YODA design aim #6: enable pleasant, clear programmatic
	/// analysis and plotting from C++ programs or Python scripts.</b>
	///
	/// - For aggregated data such as histograms, it is not essential to have a very
	/// disk-efficient persistency format. In fact, it's more important that the
	/// data be readily legible and editable. XML does not count. This leads to
	/// <b>YODA design aim #7: the principle persistency format for YODA is a
	/// compact ASCII text format.</b> Scripts are provided for easy visualisation of
	/// this format and conversion of it to other common formats, including ROOT
	/// files. An API is provided for persistency extension, and support will be
	/// provided on demand for standard efficient binary formats such as MsgPack
	/// and HDF5.
	///
	/// - Finally, high-statistics analysis requires the ability to split analyses
	/// into many small runs which are executed in parallel. <b>YODA design aim
	/// #8: store all the information required for complete and exact
	/// reconstruction of up to second-order statistical moments of complete runs
	/// from merging of constituent equivalent or distinct sub-runs. Store
	/// sufficient data that normal post-processing operations such as scaling or
	/// division may be operated <i>after</i> a run-merging phase.</b>

File Metadata

Mime Type: text/plain
Expires: Wed, May 14, 11:59 AM (1 h, 23 m)
Storage Engine: blob
Storage Format: Raw Data
Storage Handle: 5111580
Default Alt Text: main.dox (5 KB)

main.doxNo OneTemporaryActions

main.doxView Options

File Metadata

Event Timeline

main.dox
No OneTemporary
Actions

main.dox
View Options