Patrice Béchard, Orlando Marquez Ayala
Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying
Large Language Models (LLMs), as it can address typical limitations such as
generating hallucinated or outdated information. However, when building
real-world RAG applications, practical issues arise. First, the retrieved
information is generally domain-specific. Since it is computationally expensive
to fine-tune LLMs, it is more feasible to fine-tune the retriever to improve
the quality of the data included in the LLM input. Second, as more applications
are deployed in the same real-world system, one cannot afford to deploy
separate retrievers. Moreover, these RAG applications normally retrieve
different kinds of data. Our solution is to instruction fine-tune a small
retriever encoder on a variety of domain-specific tasks to allow us to deploy
one encoder that can serve many use cases, thereby achieving low cost,
scalability, and speed. We show how this encoder generalizes to out-of-domain
settings as well as to an unseen retrieval task on real-world enterprise use
cases.
Authors' comments: 9 pages, 2 figures. Submitted to NAACL 2025 Industry Track
Charles Corbière, Simon Roburin, Syrielle Montariol, Antoine Bosselut, Alexandre Alahi
While chain-of-thought (CoT) prompting improves reasoning in large language
models, its effectiveness in vision-language models (VLMs) remains limited due
to over-reliance on textual cues and memorized knowledge. To investigate the
visual reasoning capabilities of VLMs in complex real-world scenarios, we
introduce DrivingVQA, a visual question answering dataset derived from driving
theory exams, which contains 3,931 multiple-choice problems with expert-written
explanations and grounded entities relevant to the reasoning process.
Leveraging this dataset, we propose RIV-CoT, a Retrieval-Based Interleaved
Visual Chain-of-Thought method that enables VLMs to reason using visual crops
corresponding to these relevant entities. Our experiments demonstrate that
RIV-CoT improves answer accuracy by 3.1% and reasoning accuracy by 4.6% over
vanilla CoT prompting. Furthermore, we demonstrate that our method effectively
scales to the larger A-OKVQA reasoning dataset by leveraging automatically
generated pseudo-labels, outperforming CoT prompting.
Authors' comments: Project page: https://vita-epfl.github.io/DrivingVQA
Yannis Katsis, Sara Rosenthal, Kshitij Fadnis, Chulaka Gunasekara, Young-Suk Lee, Lucian Popa, Vraj Shah, Huaiyu Zhu et al.
Retrieval-augmented generation (RAG) has recently become a very popular task for Large Language Models (LLMs). Evaluating them on multi-turn RAG conversations, where the system is asked to generate a response to a question in the context of a preceding conversation, is an important and often overlooked task with several additional challenges. We present MTRAG: an end-to-end human-generated multi-turn RAG benchmark that reflects several real-world properties across diverse dimensions for evaluating the full RAG pipeline. MTRAG contains 110 conversations averaging 7.7 turns each across four domains for a total of 842 tasks. We also explore automation paths via synthetic data and LLM-as-a-Judge evaluation. Our human and automatic evaluations show that even state-of-the-art LLM RAG systems struggle on MTRAG. We demonstrate the need for strong retrieval and generation systems that can handle later turns, unanswerable questions, non-standalone questions, and multiple domains. MTRAG is available at https://github.com/ibm/mt-rag-benchmark.
Wen-Dong Jiang, Chih-Yung Chang, Diptendu Sinha Roy
Recently, violence detection systems developed using unified multimodal
models have achieved significant success and attracted widespread attention.
However, most of these systems face two critical challenges: the lack of
interpretability as black-box models and limited functionality, offering only
classification or retrieval capabilities. To address these challenges, this
paper proposes a novel interpretable violence detection system, termed the
Three-in-One (TIO) System. The TIO system integrates knowledge graphs (KG) and
graph attention networks (GAT) to provide three core functionalities:
detection, retrieval, and explanation. Specifically, the system processes each
video frame along with text descriptions generated by a large language model
(LLM) for videos containing potential violent behavior. It employs ImageBind to
generate high-dimensional embeddings for constructing a knowledge graph, uses
GAT for reasoning, and applies lightweight time series modules to extract video
embedding features. The final step connects a classifier and retriever for
multi-functional outputs. The interpretability of KG enables the system to
verify the reasoning process behind each output. Additionally, the paper
introduces several lightweight methods to reduce the resource consumption of
the TIO system and enhance its efficiency. Extensive experiments conducted on
the XD-Violence and UCF-Crime datasets validate the effectiveness of the
proposed system. A case study further reveals an intriguing phenomenon: as the
number of bystanders increases, the occurrence of violent behavior tends to
decrease.
Authors' comments: This work has been submitted to the IEEE for possible publication
Yindu Su, Huike Zou, Lin Sun, Ting Zhang, Haiyang Yang, Liyu Chen, David Lo, Qingheng Zhang et al.
Product Attribute Value Identification (PAVI) involves identifying attribute values from product profiles, a key task for improving product search, recommendations, and business analytics on e-commerce platforms. However, existing PAVI methods face critical challenges, such as inferring implicit values, handling out-of-distribution (OOD) values, and producing normalized outputs. To address these limitations, we introduce Taxonomy-Aware Contrastive Learning Retrieval (TACLR), the first retrieval-based method for PAVI. TACLR formulates PAVI as an information retrieval task by encoding product profiles and candidate values into embeddings and retrieving values based on their similarity to the item embedding. It leverages contrastive training with taxonomy-aware hard negative sampling and employs adaptive inference with dynamic thresholds. TACLR offers three key advantages: (1) it effectively handles implicit and OOD values while producing normalized outputs; (2) it scales to thousands of categories, tens of thousands of attributes, and millions of values; and (3) it supports efficient inference for high-load industrial scenarios. Extensive experiments on proprietary and public datasets validate the effectiveness and efficiency of TACLR. Moreover, it has been successfully deployed in a real-world e-commerce platform, processing millions of product listings daily while supporting dynamic, large-scale attribute taxonomies.
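The retrieval formulation described above can be illustrated with a minimal sketch. It is not the TACLR implementation: the embeddings are toy two-dimensional vectors, the names `cosine` and `retrieve_value` are hypothetical, and a single fixed `threshold` stands in for the dynamic thresholds mentioned in the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_value(item_vec, candidates, threshold):
    """Return the most similar candidate value, or None if none reaches the threshold."""
    best_value, best_score = None, threshold
    for value, vec in candidates.items():
        score = cosine(item_vec, vec)
        if score >= best_score:
            best_value, best_score = value, score
    return best_value

# Toy candidate-value embeddings (assumed, for illustration only).
candidates = {"red": [1.0, 0.0], "blue": [0.0, 1.0]}

print(retrieve_value([0.9, 0.1], candidates, threshold=0.5))    # red
print(retrieve_value([-1.0, -1.0], candidates, threshold=0.5))  # None
```

Returning `None` when no candidate clears the threshold is one simple way to model an attribute that has no applicable value for the item.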
Binita Saha, Utsha Saha, Muhammad Zubair Malik
This work presents a novel architecture for building Retrieval-Augmented Generation (RAG) systems to improve Question Answering (QA) tasks from a target corpus. Large Language Models (LLMs) have revolutionized the analysis and generation of human-like text. These models rely on pre-trained data and lack real-time updates unless integrated with live data tools. RAG enhances LLMs by integrating online resources and databases to generate contextually appropriate responses. However, traditional RAG still encounters challenges like information dilution and hallucinations when handling vast amounts of data. Our approach addresses these challenges by converting corpora into a domain-specific dataset and constructing a RAG architecture that generates responses from the target document. We introduce QuIM-RAG (Question-to-question Inverted Index Matching), a novel approach for the retrieval mechanism in our system. This strategy generates potential questions from document chunks and matches these with user queries to identify the most relevant text chunks for generating accurate answers. We have implemented our RAG system on top of the open-source Meta-LLaMA3-8B-instruct model by Meta Inc., which is available on Hugging Face. We constructed a custom corpus of 500+ pages from a high-traffic website accessed thousands of times daily for answering complex questions, along with manually prepared ground truth QA for evaluation. We compared our approach with traditional RAG models using BERT-Score and RAGAS, state-of-the-art metrics for evaluating LLM applications. Our evaluation demonstrates that our approach outperforms traditional RAG architectures on both metrics.
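The question-to-question matching idea can be sketched as follows. This is an illustration under stated assumptions, not the QuIM-RAG implementation: the index contents and chunk identifiers are invented, and token-overlap (Jaccard) similarity stands in for whatever matching function the authors use.

```python
def tokens(text):
    """Lowercase a question and split it into a set of word tokens."""
    return set(text.lower().replace("?", "").split())

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical inverted index: question generated from a chunk -> chunk id.
question_index = {
    "What are the library opening hours?": "chunk-hours",
    "How do I apply for admission?": "chunk-admission",
}

def match_chunk(user_query, index):
    """Return the chunk whose generated question best matches the user query."""
    query_tokens = tokens(user_query)
    best_question = max(index, key=lambda q: jaccard(query_tokens, tokens(q)))
    return index[best_question]

print(match_chunk("When are the library opening hours?", question_index))  # chunk-hours
```

Matching the query against generated questions rather than raw chunk text compares like with like, which is the intuition behind the approach described in the abstract.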
Yubo Wang, Haoyang Li, Fei Teng, Lei Chen
Text classification is a fundamental task in data mining, pivotal to various applications such as tabular understanding and recommendation. Although neural network-based models, such as CNN and BERT, have demonstrated remarkable performance in text classification, their effectiveness heavily relies on abundant labeled training data. This dependency makes these models less effective in dynamic few-shot text classification, where labeled data is scarce, and new target labels frequently appear based on application needs. Recently, large language models (LLMs) have shown promise due to their extensive pretraining and contextual understanding ability. Current approaches provide LLMs with text inputs, candidate labels, and additional side information (e.g., descriptions) to classify texts. However, their effectiveness is hindered by the increased input size and the noise introduced through side information processing. To address these limitations, we propose a graph-based online retrieval-augmented generation framework, namely GORAG, for dynamic few-shot text classification. Rather than treating each input independently, GORAG constructs and maintains a weighted graph by extracting side information across all target texts. In this graph, text keywords and labels are represented as nodes, with edges indicating the correlations between them. To model these correlations, GORAG employs an edge weighting mechanism to prioritize the importance and reliability of extracted information and dynamically retrieves relevant context using a minimum-cost spanning tree tailored for each text input. Empirical evaluations demonstrate that GORAG outperforms existing approaches by providing more comprehensive and precise contextual information.
Zhuo Chen, Jiawei Liu, Yuyang Gong, Miaokun Chen, Haotan Liu, Qikai Cheng, Fan Zhang, Wei Lu et al.
Retrieval-Augmented Generation (RAG) enriches LLMs by dynamically retrieving
external knowledge, reducing hallucinations and satisfying real-time
information needs. While existing research mainly targets RAG's performance and
efficiency, emerging studies highlight critical security concerns. Yet, current
adversarial approaches remain limited, mostly addressing white-box scenarios or
heuristic black-box attacks without fully investigating vulnerabilities in the
retrieval phase. Additionally, prior works mainly focus on factoid QA tasks;
their attacks lack complexity and can be easily corrected by advanced LLMs. In
this paper, we investigate a more realistic and critical threat scenario:
adversarial attacks intended for opinion manipulation against black-box RAG
models, particularly on controversial topics. Specifically, we propose
FlippedRAG, a transfer-based adversarial attack against black-box RAG systems.
We first demonstrate that the underlying retriever of a black-box RAG system
can be reverse-engineered, enabling us to train a surrogate retriever.
Leveraging the surrogate retriever, we further craft target poisoning triggers,
altering very few documents to effectively manipulate both retrieval and
subsequent generation. Extensive empirical results show that FlippedRAG
substantially outperforms baseline methods, improving the average attack
success rate by 16.7%. FlippedRAG achieves on average a 50% directional shift
in the opinion polarity of RAG-generated responses, ultimately causing a
notable 20% shift in user cognition. Furthermore, we evaluate the performance
of several potential defensive measures, concluding that existing mitigation
strategies remain insufficient against such sophisticated manipulation attacks.
These results highlight an urgent need for developing innovative defensive
solutions to ensure the security and trustworthiness of RAG systems.
Authors' comments: arXiv admin note: text overlap with arXiv:2407.13757
Ran Tao, Chong Wang, Hao Chen, Mingjiao Jia, Xiang Shang, Luoyuan Qu, Guoliang Shentu, Yanyu Lu et al.
Accurate detection of wind fields within the troposphere is essential for
atmospheric dynamics research and plays a crucial role in extreme weather
forecasting. Coherent Doppler wind lidar (CDWL) is widely regarded as the most
suitable technique for high spatial and temporal resolution wind field
detection. However, since coherent detection relies heavily on the
concentration of aerosol particles, which cause Mie scattering, the received
backscattering lidar signal exhibits significantly low intensity at high
altitudes. As a result, conventional methods, such as spectral centroid
estimation, often fail to produce credible and accurate wind retrieval results
in these regions. To address this issue, we propose LWFNet, the first
Lidar-based Wind Field (WF) retrieval neural Network, built upon Transformer
and the Kolmogorov-Arnold network. Our model is trained solely on targets
derived from the traditional wind retrieval algorithm and utilizes radiosonde
measurements as the ground truth for test results evaluation. Experimental
results demonstrate that LWFNet not only extends the maximum wind field
detection range but also produces more accurate results, exhibiting a level of
precision that surpasses the labeled targets. This phenomenon, which we refer
to as super-accuracy, is explored by investigating the potential underlying
factors that contribute to this intriguing occurrence. In addition, we compare
the performance of LWFNet with other state-of-the-art (SOTA) models,
highlighting its superior effectiveness and capability in high-resolution wind
retrieval. LWFNet demonstrates remarkable performance in lidar-based wind field
retrieval, setting a benchmark for future research and advancing the
development of deep learning models in this domain.
Authors' comments: 13 pages, 7 figures
Sung Jin Um, Dongjin Kim, Sangmin Lee, Jung Uk Kim
The goal of video moment retrieval and highlight detection is to identify
specific segments and highlights based on a given text query. With the rapid
growth of video content and the overlap between these tasks, recent works have
addressed both simultaneously. However, they still struggle to fully capture
the overall video context, making it challenging to determine which words are
most relevant. In this paper, we present a novel Video Context-aware Keyword
Attention module that overcomes this limitation by capturing keyword variation
within the context of the entire video. To achieve this, we introduce a video
context clustering module that provides concise representations of the overall
video context, thereby enhancing the understanding of keyword dynamics.
Furthermore, we propose a keyword weight detection module with keyword-aware
contrastive learning that incorporates keyword information to enhance
fine-grained alignment between visual and textual features. Extensive
experiments on the QVHighlights, TVSum, and Charades-STA benchmarks demonstrate
that our proposed method significantly improves performance in moment retrieval
and highlight detection tasks compared to existing approaches. Our code is
available at: https://github.com/VisualAIKHU/Keyword-DETR
Authors' comments: Accepted at AAAI 2025
Mehmet Deniz Türkmen, Mucahid Kutlu, Bahadir Altun, Gokalp Cosgun
Building test collections for Information Retrieval evaluation has traditionally been a resource-intensive and time-consuming task, primarily due to the dependence on manual relevance judgments. While various cost-effective strategies have been explored, the development of such collections remains a significant challenge. In this paper, we present GenTREC, the first test collection constructed entirely from documents generated by a Large Language Model (LLM), eliminating the need for manual relevance judgments. Our approach is based on the assumption that documents generated by an LLM are inherently relevant to the prompts used for their generation. Based on this heuristic, we utilized existing TREC search topics to generate documents. We consider a document relevant only to the prompt that generated it, while other document-topic pairs are treated as non-relevant. To introduce realistic retrieval challenges, we also generated non-relevant documents, ensuring that IR systems are tested against a diverse and robust set of materials. The resulting GenTREC collection comprises 96,196 documents, 300 topics, and 18,964 relevance "judgments". We conducted extensive experiments to evaluate GenTREC in terms of document quality, relevance judgment accuracy, and evaluation reliability. Notably, our findings indicate that the ranking of IR systems using GenTREC is compatible with the evaluations conducted using traditional TREC test collections, particularly for P@100, MAP, and RPrec metrics. Overall, our results show that our proposed approach offers a promising, low-cost alternative for IR evaluation, significantly reducing the burden of building and maintaining future IR evaluation resources.
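The relevance assumption above (a document is relevant only to the topic that generated it) and a precision-at-k computation can be sketched as follows. The document and topic identifiers are invented for illustration, and `build_qrels` and `precision_at_k` are hypothetical helper names, not part of GenTREC.

```python
def build_qrels(generated):
    """Build relevance labels from generation provenance.

    generated: iterable of (doc_id, topic_id, relevant_by_design) tuples.
    Any document-topic pair not listed is treated as non-relevant.
    """
    qrels = {}
    for doc_id, topic_id, relevant in generated:
        qrels.setdefault(topic_id, {})[doc_id] = 1 if relevant else 0
    return qrels

def precision_at_k(ranking, topic_qrels, k):
    """Fraction of the top-k ranked documents labeled relevant for the topic."""
    return sum(topic_qrels.get(doc_id, 0) for doc_id in ranking[:k]) / k

# Toy provenance records (assumed data).
generated = [
    ("d1", "t1", True),   # generated as relevant to topic t1
    ("d2", "t1", False),  # generated as a non-relevant distractor for t1
    ("d3", "t2", True),   # relevant to t2, hence non-relevant to t1
]
qrels = build_qrels(generated)

print(precision_at_k(["d1", "d3", "d2"], qrels["t1"], k=2))  # 0.5
```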
Zhe Chen, Yusheng Liao, Shuyang Jiang, Pingjie Wang, Yiqiu Guo, Yanfeng Wang, Yu Wang
Large language models hold promise for addressing medical challenges, such as medical diagnosis reasoning, research knowledge acquisition, clinical decision-making, and consumer health inquiry support. However, they often generate hallucinations due to limited medical knowledge. Incorporating external knowledge is therefore critical, which necessitates multi-source knowledge acquisition. We address this challenge by framing it as a source planning problem, which is to formulate context-appropriate queries tailored to the attributes of diverse sources. Existing approaches either overlook source planning or fail to achieve it effectively due to misalignment between the model's expectation of the sources and their actual content. To bridge this gap, we present MedOmniKB, a repository comprising multigenre and multi-structured medical knowledge sources. Leveraging these sources, we propose the Source Planning Optimisation method, which enhances multi-source utilisation. Our approach involves enabling an expert model to explore and evaluate potential plans while training a smaller model to learn source alignment. Experimental results demonstrate that our method substantially improves multi-source planning performance, enabling the optimised small model to achieve state-of-the-art results in leveraging diverse medical knowledge sources.
Yihang Zhou
The two main gonadal development disorders in dogs are true hermaphroditism and XX male syndrome. True hermaphroditism can be divided into two subcategories: XX sex reversal and XY sex reversal. XX Sry-negative sex reversal is more common, and it is characterized by the presence of both ovarian and testicular tissues in an animal. To date, there are 16 cases of true hermaphroditism reported in the literature, 15 of which are XX true hermaphroditism. Hermaphroditism has not been formally documented in Labrador retrievers, and no case of asymmetric hermaphroditism has been reported in the literature.
Elvis Kimara, Kunle S. Oguntoye, Jian Sun
This paper introduces PersonaAI, a cutting-edge application that leverages Retrieval-Augmented Generation (RAG) and the LLAMA model to create highly personalized digital avatars capable of accurately mimicking individual personalities. Designed as a cloud-based mobile application, PersonaAI captures user data seamlessly, storing it in a secure database for retrieval and analysis. The result is a system that provides context-aware, accurate responses to user queries, enhancing the potential of AI-driven personalization. Why should you care? PersonaAI combines the scalability of RAG with the efficiency of prompt-engineered LLAMA3, offering a lightweight, sustainable alternative to traditional large language model (LLM) training methods. The system's novel approach to data collection, utilizing real-time user interactions via a mobile app, ensures enhanced context relevance while maintaining user privacy. By open-sourcing our implementation, we aim to foster adaptability and community-driven development. PersonaAI demonstrates how AI can transform interactions by merging efficiency, scalability, and personalization, making it a significant step forward in the future of digital avatars and personalized AI.
Shuhei Tomoshige, Hayato Muraki, Kenichi Oishi, Hitoshi Iyatomi
Current methods for searching brain MR images rely on text-based approaches,
highlighting a significant need for content-based image retrieval (CBIR)
systems. Directly applying 3D brain MR images to machine learning models offers
the benefit of effectively learning the brain's structure; however, building
the generalized model necessitates a large amount of training data. While
models that consider depth direction and utilize continuous 2D slices have
demonstrated success in segmentation and classification tasks involving 3D
data, concerns remain. Specifically, using general 2D slices may lead to the
oversight of pathological features and discontinuities in depth direction
information. Furthermore, to the best of the authors' knowledge, there have
been no attempts to develop a practical CBIR system that preserves the entire
brain's structural information. In this study, we propose an interpretable CBIR
method for brain MR images, named iCBIR-Sli (Interpretable CBIR with 2D Slice
Embedding), which, for the first time globally, utilizes a series of 2D slices.
iCBIR-Sli addresses the challenges associated with using 2D slices by
effectively aggregating slice information, thereby achieving low-dimensional
representations with high completeness, usability, robustness, and
interoperability, which are qualities essential for effective CBIR. In
retrieval evaluation experiments utilizing five publicly available brain MR
datasets (ADNI2/3, OASIS3/4, AIBL) for Alzheimer's disease and cognitively
normal subjects, iCBIR-Sli demonstrated top-1 retrieval performance (macro F1 = 0.859),
comparable to existing deep learning models explicitly designed for
classification, without the need for an external classifier. Additionally, the
method provided high interpretability by clearly identifying the brain regions
indicative of the searched-for disease.
Authors' comments: 8 pages, 2 figures. Accepted at the SPIE Medical Imaging
Ruitao Pu, Yuan Sun, Yang Qin, Zhenwen Ren, Xiaomin Song, Huiming Zheng, Dezhong Peng
Cross-modal hashing (CMH) has emerged as a popular technique for cross-modal
retrieval due to its low storage cost and high computational efficiency in
large-scale data. Most existing methods implicitly assume that multi-modal data
is correctly labeled, which is expensive and even unattainable due to the
inevitable imperfect annotations (i.e., noisy labels) in real-world scenarios.
Inspired by human cognitive learning, a few methods introduce self-paced
learning (SPL) to gradually train the model from easy to hard samples, which is
often used to mitigate the effects of feature noise or outliers. How to use
SPL to alleviate the misleading effect of noisy labels on the hash model
remains a less-explored problem. To tackle this problem, we propose a new
cognitive cross-modal retrieval method called Robust Self-paced Hashing with
Noisy Labels (RSHNL), which can mimic the human cognitive process to identify
the noise while embracing robustness against noisy labels. Specifically, we
first propose a contrastive hashing learning (CHL) scheme to improve
multi-modal consistency, thereby reducing the inherent semantic gap. Afterward,
we propose center aggregation learning (CAL) to mitigate the intra-class
variations. Finally, we propose Noise-tolerance Self-paced Hashing (NSH) that
dynamically estimates the learning difficulty for each instance and
distinguishes noisy labels through the difficulty level. For all estimated
clean pairs, we further adopt a self-paced regularizer to gradually learn hash
codes from easy to hard. Extensive experiments demonstrate that the proposed
RSHNL performs remarkably well over the state-of-the-art CMH methods.
Authors' comments: 9 pages, AAAI 25 conference
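The storage and speed argument for hashing-based retrieval can be made concrete with a small sketch of Hamming-distance ranking over binary codes. This shows only the generic retrieval step, not RSHNL: the 8-bit codes and item names are invented, and in practice the codes would come from learned hash functions for each modality.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

# Hypothetical 8-bit hash codes for images in the database.
image_codes = {
    "img_cat": 0b10110010,
    "img_car": 0b01001101,
}

# Hypothetical hash code of a text query.
text_query_code = 0b10110110

# Rank images by Hamming distance to the text query (smaller is closer).
ranked = sorted(image_codes, key=lambda name: hamming(text_query_code, image_codes[name]))

print(ranked)  # ['img_cat', 'img_car']
```

Each comparison is a single XOR plus a bit count, which is where the low storage cost and high efficiency mentioned in the abstract come from.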
Shuyue Xue, Mohammad Maghrebi, George I. Mias, Carlo Piermarocchi
We study Hopfield networks with non-reciprocal coupling inducing switches between memory patterns. Dynamical phase transitions occur between phases of no memory retrieval, retrieval of multiple point-attractors, and limit-cycle attractors. The limit cycle phase is bounded by two critical regions: a Hopf bifurcation line and a fold bifurcation line, each with unique dynamical critical exponents and sensitivity to perturbations. A Master Equation approach numerically verifies the critical behavior predicted analytically. We discuss how these networks could model biological processes near a critical threshold of cyclic instability evolving through multi-step transitions.
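How a non-reciprocal coupling term can drive switches between stored patterns can be seen in a small deterministic sketch. It uses a textbook-style asymmetric Hebbian rule with synchronous updates and is not the model or analysis from the paper: the three orthogonal patterns, the strength `LAM`, and the update rule are all assumptions chosen for illustration.

```python
# Three mutually orthogonal patterns on N = 4 units (assumed toy data).
patterns = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
]
N = len(patterns[0])
P = len(patterns)
LAM = 2.0  # assumed strength of the non-reciprocal term

def coupling(i, j):
    """Symmetric Hebbian term plus an asymmetric term mapping pattern m to m+1."""
    sym = sum(p[i] * p[j] for p in patterns)
    asym = sum(patterns[(m + 1) % P][i] * patterns[m][j] for m in range(P))
    return (sym + LAM * asym) / N

J = [[coupling(i, j) for j in range(N)] for i in range(N)]

def step(state):
    """One synchronous sign update of all units."""
    fields = [sum(J[i][j] * state[j] for j in range(N)) for i in range(N)]
    return [1 if h >= 0 else -1 for h in fields]

state = patterns[0]
trajectory = [state]
for _ in range(3):
    state = step(state)
    trajectory.append(state)

for s in trajectory:
    print(s)
```

With these choices the state visits the first, second, and third pattern in turn and then returns to the first, a discrete analogue of a limit cycle. Setting `LAM` to 0.5 instead leaves the initial pattern fixed, since the symmetric term then dominates.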
Weiqi Wu, Shen Huang, Yong Jiang, Pengjun Xie, Fei Huang, Hai Zhao
In the fast-changing realm of information, the capacity to construct coherent timelines from extensive event-related content has become increasingly significant and challenging. The complexity arises in aggregating related documents to build a meaningful event graph around a central topic. This paper proposes CHRONOS - Causal Headline Retrieval for Open-domain News Timeline SummarizatiOn via Iterative Self-Questioning, which offers a fresh perspective on the integration of Large Language Models (LLMs) to tackle the task of Timeline Summarization (TLS). By iteratively reflecting on how events are linked and posing new questions regarding a specific news topic to gather information online or from an offline knowledge base, LLMs produce and refresh chronological summaries based on documents retrieved in each round. Furthermore, we curate Open-TLS, a novel dataset of timelines on recent news topics authored by professional journalists to evaluate open-domain TLS where information overload makes it impossible to find comprehensive relevant documents from the web. Our experiments indicate that CHRONOS is not only adept at open-domain timeline summarization, but it also rivals the performance of existing state-of-the-art systems designed for closed-domain applications, where a related news corpus is provided for summarization.
Chengcheng Mai, Yuxiang Wang, Ziyu Gong, Hanxiang Wang, Yihua Huang
Document-level relation extraction (Doc-RE) aims to extract relations between
entities across multiple sentences. Compared to sentence-level RE, Doc-RE
therefore requires more comprehensive, human-like reasoning abilities,
involving complex cross-sentence interactions between entities, contexts, and
external general knowledge. However, most existing Doc-RE methods focus on
optimizing a single reasoning ability and lack the ability to utilize
external knowledge for comprehensive reasoning on long documents. To solve
these problems, we propose KnowRA, a knowledge retrieval augmented method
with comprehensive reasoning that autonomously determines whether to
accept external knowledge to assist Doc-RE. Firstly, we constructed a document
graph for semantic encoding and integrated the co-reference resolution model to
augment the co-reference reasoning ability. Then, we expanded the document
graph into a document knowledge graph by retrieving from an external knowledge
base for common-sense reasoning, and presented a novel knowledge filtration
method to filter out irrelevant knowledge. Finally, we proposed the axis
attention mechanism to build direct and indirect associations with intermediary
entities for achieving cross-sentence logical reasoning. Extensive experiments
conducted on two datasets verified the effectiveness of our method compared to
the state-of-the-art baselines. Our code is available at
https://anonymous.4open.science/r/KnowRA.
Authors' comments: This work has been accepted by IJCAI 2025 (CCF A)
Kayna L. Mendoza, Haoyang Ni, Georgios Varnavides, Miaofang Chi, Colin Ophus, Amanda Petford-Long, Charudatta Phatak
Magnetic materials phase reconstruction from Lorentz transmission electron
microscopy (LTEM) measurements has traditionally been achieved using
longstanding methods such as off-axis holography (OAH) and the
transport-of-intensity equation (TIE). Amidst the increase in access to
processing power and the development of advanced algorithms, phase retrieval of
nanoscale magnetic materials with higher fidelity and resolution, potentially
down to the few nanometer limit, becomes possible. Specifically, reverse-mode
automatic differentiation (RMAD) and the extended electron ptychography
iterative engine (ePIE) are two methods that have been utilized for high
confidence phase reconstructions using LTEM through-focal series imaging and
Lorentz scanning TEM (Ltz-4D-STEM), respectively. This work evaluates phase
retrieval using TIE, RMAD, and ePIE in simulations consisting of an array of
Permalloy (Ni80Fe20) nanoscale islands. Extending beyond simulations, we
demonstrate total phase reconstructions of a NiFe nanowire using OAH and RMAD
in LTEM and ePIE in Ltz-4D-STEM experiments and determine the magnetization
saturation through corroborations with micromagnetic simulations. Finally, we
show how the total phase shift gradient can be utilized to observe and
characterize the proximity effects emanating from neighboring magnetic island
interactions and an isolated NiFe nanowire.
Authors' comments: 14 pages, 5 figures, 3 supplementary figures, and government
copyright
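For reference, the TIE named above is commonly written in the following paraxial form. The symbols are assumptions introduced here and do not come from the abstract: I is the image intensity, \phi the phase, z the defocus direction, \lambda the electron wavelength, and \nabla_{\perp} the gradient in the image plane. Sign conventions vary with the definition of phase and propagation direction.

```latex
\nabla_{\perp} \cdot \left[ I(\mathbf{r}_{\perp}, z)\, \nabla_{\perp} \phi(\mathbf{r}_{\perp}) \right]
  = -\frac{2\pi}{\lambda}\, \frac{\partial I(\mathbf{r}_{\perp}, z)}{\partial z}
```

For a nearly uniform in-focus intensity I_0, this reduces to a Poisson equation, \nabla_{\perp}^{2} \phi = -\frac{2\pi}{\lambda I_0} \frac{\partial I}{\partial z}, which is why a through-focal series is enough to estimate the phase.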