How much of a near infrared spectrum is useful? Sparse regularization---let the data decide!


  • Robert Scott Anderssen
  • Frank Robert de Hoog CSIRO Maths Info Stats
  • Ian J. Wesley
  • Alec Zwart CSIRO Maths Info Stats



near infrared, sparse regularization, derivative spectroscopy, casein


When information is recovered from indirect measurements of a phenomenon of interest (e.g. near infrared spectra of milk powders or pharmaceuticals, or Raman spectra of explosives or anaesthetics), the available data can be partitioned into two components: (i) the information that encapsulates the answer to the question under examination (the proportion of casein, the major protein component, in milk powder; the presence or absence of explosives; anaesthetic and respiratory levels monitored during surgery) and (ii) a considerable amount of superfluous information, the presence of which compromises the reliability of that answer. In such spectroscopic situations, a variety of techniques are used to identify the information that encapsulates the answer, including partial least squares, neural networks and support vector machines. With respect to the available calibration data, the support vector machine procedure performs an implicit form of sparse regularization. The aim of this article is to show how, using the Beer-Lambert law and derivative spectroscopy, the sparse regularization can be performed explicitly. The information identified in this way can subsequently be used to construct an appropriate predictor by statistical regression. The goal here is a proof of concept for derivative spectroscopy as an explicit sparse regularization protocol. For this purpose, the calibration data consist of near infrared spectra of milk powder spiked with known amounts of casein, and the property of interest is the proportion of casein in the milk powder.
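As a rough illustration of the idea summarized in the abstract, the following Python sketch simulates calibration spectra under the Beer-Lambert law, takes second derivatives to suppress baseline effects, keeps only the wavelengths whose derivative signal tracks the known casein proportions, and fits a regression predictor on that sparse subset. It is a minimal sketch under stated assumptions and not the article's procedure: the band positions and widths, the noise level, the correlation threshold and every function name are hypothetical, and simple correlation screening stands in for whatever selection rule the article actually uses.

```python
import math
import random

random.seed(0)  # reproducible synthetic data

STEP_NM = 10.0
wavelengths = [1100.0 + STEP_NM * i for i in range(141)]  # 1100..2500 nm

def band(centre, width):
    """Gaussian absorption band sampled on the wavelength grid."""
    return [math.exp(-0.5 * ((w - centre) / width) ** 2) for w in wavelengths]

# Hypothetical absorptivities (illustrative only, not measured values).
CASEIN_BAND = band(2050.0, 30.0)  # analyte of interest
MATRIX_BAND = band(1450.0, 40.0)  # unrelated constituent that also varies

def simulate_spectrum(casein):
    """Beer-Lambert: absorbance is a concentration-weighted sum of bands,
    here with a random linear baseline and small measurement noise added."""
    matrix = random.uniform(0.5, 1.0)
    offset = random.uniform(0.0, 0.2)
    slope = random.uniform(-1e-4, 1e-4)
    return [casein * a + matrix * m + offset + slope * (w - wavelengths[0])
            + random.gauss(0.0, 1e-4)
            for a, m, w in zip(CASEIN_BAND, MATRIX_BAND, wavelengths)]

def second_derivative(spectrum):
    """Central second difference; cancels constant and linear baselines."""
    return [(spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]) / STEP_NM ** 2
            for i in range(1, len(spectrum) - 1)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

# Calibration set: spectra "spiked" with known casein proportions.
concentrations = [0.005 * k for k in range(20)]
derivs = [second_derivative(simulate_spectrum(c)) for c in concentrations]

# Explicit sparse selection: keep wavelengths whose second-derivative
# signal is strongly correlated with the known proportions.
THRESHOLD = 0.95  # assumed cut-off
corr = [pearson([d[j] for d in derivs], concentrations)
        for j in range(len(derivs[0]))]
selected = [j for j, r in enumerate(corr) if abs(r) >= THRESHOLD]
selected_nm = [wavelengths[j + 1] for j in selected]  # derivative drops point 0

def score(deriv):
    """Signed sum of the derivative over the selected wavelengths."""
    return sum(math.copysign(1.0, corr[j]) * deriv[j] for j in selected)

# Least-squares line: proportion ~ intercept + slope * score.
scores = [score(d) for d in derivs]
ms = sum(scores) / len(scores)
mc = sum(concentrations) / len(concentrations)
fit_slope = (sum((s - ms) * (c - mc) for s, c in zip(scores, concentrations))
             / sum((s - ms) ** 2 for s in scores))
fit_intercept = mc - fit_slope * ms

def predict(spectrum):
    """Predict the casein proportion of a new spectrum."""
    return fit_intercept + fit_slope * score(second_derivative(spectrum))

test_true = 0.042
test_pred = predict(simulate_spectrum(test_true))
print(f"kept {len(selected)} of {len(corr)} wavelengths: {selected_nm}")
print(f"true proportion {test_true:.3f}, predicted {test_pred:.3f}")
```

In this toy setting the retained wavelengths cluster around the assumed casein band, while the region dominated by the unrelated constituent and the baseline-only regions are discarded before any regression is fitted. With real spectra the finite-difference derivative would normally be replaced by a smoothed derivative, since differentiation amplifies measurement noise.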





Proceedings Computational Techniques and Applications Conference