Knowledge book on lignocellulose deconstruction: an INRA project to identify key actions in research on biorefineries

The existence of pilot- and industrial-scale biorefineries worldwide demonstrates the technical and economic feasibility of fractionating lignocellulose (LC) for chemistry and energy. This raises new questions about biomass supply, the management of biomass quality, and the way elementary steps are combined in processes to determine which compounds will be the main products or co-products from plant biomass. Integrated and systemic approaches are required to invent and/or improve the biotechnical fractionation of LC, and there is a need to collect and correlate existing knowledge in a structured way to gain better insight into the overall process. Building such a knowledge representation is important for scientists, research institutes, universities and industries, as it provides a shared description of the knowledge in the field, which in turn facilitates its diffusion, re-use, review, reassessment and updating with new findings. In practice, an extensive literature on the biorefining of LC has been published in the past five years, focusing mostly on the saccharification of polysaccharides (cellulose, hemicellulose). As a consequence, most of the available data result from biochemical and physicochemical analyses of several processing chains, which combine different modes of physical and/or chemical pretreatment, variable biomass typologies and/or various hydrolytic enzyme cocktails. For this reason, a project to develop a hypertext electronic Knowledge Book on LignoCellulose DeConstruction (KB-LCDC) was initiated by the French National Institute for Agricultural Research (INRA) with two main goals: (i) to elicit the available knowledge from various sources, more specifically knowledge related to the enzymatic hydrolysis of wheat straw into glucose; and (ii) to represent this knowledge and implement it in a web-based Knowledge Book (KB) format that takes the overall saccharification process into account.
The knowledge was first elicited through semi-structured interviews with a group of six experts working in several INRA research units and involved in the Institute’s biomass transformation network. Data and knowledge were collated concurrently from grey and peer-reviewed literature. Our approach then consisted in building a knowledge book (KB) whose pages are formatted concept maps (Cmaps) and technical sheets connected by hypertext links. A Cmap is a semantic graph in which nodes represent concepts and arcs express the relationships between them. A formatted Cmap answers a specific question about one central concept (for instance: What is the impact of pretreatment on the reactivity of biomass? How do enzymes diffuse into the lignocellulose?). The hyperlinks between Cmaps and technical sheets form a network of knowledge that the user can navigate to find relevant answers as well as associated concepts. Hyperlinks can also connect a Cmap or technical sheet to an Internet page, a scientific article or any other document selected to illustrate a concept in practice. Up to nine knowledge areas have been identified so far, among them biomass pretreatment, separation methods, enzyme cocktails, substrate reactivity and hydrolysis mechanisms. A global representation of the overall process from wheat straw to glucose, based on the individual Cmaps, has been built. It includes a static structural view (environment and reactivity of the media, encompassing the cell wall), a dynamic view (hierarchy of the different sub-processes at work) and a functional view (description of the elementary steps and how they are organized in time). At the LBT III congress, the practical structuring of the knowledge and the original version of the KB will be presented.
The potential development and use of this new approach for the representation of biotechnology processes applied to LC will be discussed (process workflow; unlocking cell wall recalcitrance; strategic roadmaps).
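
To make the structure of the knowledge book more concrete, the following minimal Python sketch models a Cmap as a labelled directed graph and the KB as a set of pages connected by hyperlinks. It is an illustration only and not the KB-LCDC implementation: the class name `Cmap`, the function `reachable_pages`, the page identifiers and the relationship labels are all assumptions introduced for this example. Only the two example questions are taken from the text above.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Cmap:
    """One formatted concept map: a question about one central concept."""
    question: str
    central_concept: str
    # Arcs stored as (source concept, relationship label, target concept).
    arcs: List[Tuple[str, str, str]] = field(default_factory=list)
    # Hyperlinks from a concept to other pages (Cmaps, technical sheets, documents).
    links: Dict[str, List[str]] = field(default_factory=dict)

    def relate(self, source: str, relation: str, target: str) -> None:
        self.arcs.append((source, relation, target))

    def link(self, concept: str, page_id: str) -> None:
        self.links.setdefault(concept, []).append(page_id)

    def concepts(self) -> Set[str]:
        """All concepts (nodes) appearing in this map."""
        found = {self.central_concept}
        for source, _, target in self.arcs:
            found.update((source, target))
        return found


def reachable_pages(book: Dict[str, Cmap], start: str) -> Set[str]:
    """Breadth-first walk over hyperlinks, returning every page reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        cmap = book.get(queue.popleft())
        if cmap is None:
            # Technical sheet or external document: outgoing links are not modelled here.
            continue
        for targets in cmap.links.values():
            for page_id in targets:
                if page_id not in seen:
                    seen.add(page_id)
                    queue.append(page_id)
    return seen


# Hypothetical content: relationship labels and page ids are illustrative assumptions.
pretreatment = Cmap(
    question="What is the impact of pretreatment on the reactivity of biomass?",
    central_concept="pretreatment",
)
pretreatment.relate("pretreatment", "modifies", "substrate reactivity")
pretreatment.link("substrate reactivity", "cmap:enzyme-diffusion")

diffusion = Cmap(
    question="How do enzymes diffuse into the lignocellulose?",
    central_concept="enzyme",
)
diffusion.relate("enzyme", "diffuses into", "lignocellulose")
diffusion.link("enzyme", "sheet:enzyme-cocktails")

book = {"cmap:pretreatment": pretreatment, "cmap:enzyme-diffusion": diffusion}
pages = reachable_pages(book, "cmap:pretreatment")
```

In this sketch, navigating from the pretreatment map reaches the enzyme diffusion map and, through it, a technical sheet, which mirrors how a user would follow hyperlinks from one question to associated concepts.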