Reasoning in NLP

Feb 24, 2016

Models that claim to understand language should also be able to demonstrate their ability to reason across various dimensions. My present goal is to evaluate, enhance, and explain the reasoning capabilities of such systems (or language models).

!!NEW!! Reasoning in LLMs

Our group has invested significantly in advancing the reasoning abilities of LLMs in a multi-hop setting. The following drafts are in progress: 1) RelSelect$^+$: Efficient Leaf Selection to Improve Entailment Tree Generation, 2) LogicPO: Efficient Translation of NL-based Logical Problems to FOL using LLMs and Preference Optimization (under review at COLM 2025), and 3) Multi-step Logical Reasoning under Incomplete Knowledge.

References

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs, Haritz Puerto¹, Martin Tutek¹, Somak Aditya², Xiaodan Zhu¹,³, Iryna Gurevych¹. ¹Ubiquitous Knowledge Processing Lab (UKP Lab), TU Darmstadt and Hessian Center for AI (hessian.AI), ²IIT Kharagpur, ³Queen’s University. EMNLP 2024 (Main) !!NEW!!
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models, Man Luo¹,², Shrinidhi Kumbhar¹, Ming Shen¹, Mihir Parmar¹, Neeraj Varshney¹, Pratyay Banerjee³, Somak Aditya⁴, Chitta Baral¹. ¹Arizona State University, ²Mayo Clinic, ³Amazon Alexa AI, ⁴IIT KGP. ArXiv, Nov 2023 !!NEW!!
Generating Intermediate Steps for NLI with Next-Step Supervision, AACL-IJCNLP 2023, Main !!NEW!!

Natural Language Inference

Large pre-trained language models show high performance on popular NLP benchmarks (GLUE, SuperGLUE), while performing poorly on datasets with targeted linguistic and logical phenomena. We consolidate the interesting reasoning phenomena into a taxonomy of reasoning for the NLI task. Our first work along this line, published in CoNLL 2020, showed that these models (BERT, RoBERTa) may not know how to perform certain types of reasoning, such as causal, numeric, spatial, and temporal reasoning, but they can identify the type of reasoning required for a new example.

In a follow-up, we adapted the CheckList methodology to create a large CheckList-NLI dataset that tests different reasoning capabilities individually yet collectively, including pragmatic ones. Through our test suite, we show that such a post-hoc evaluation provides a more comprehensive overview of the behavioral nature of the language models. A thorough human study with Linguistic Olympiad participants shows that a behavioral summary leads to better explanations and that RoBERTa’s behavior is more predictable than BERT’s. Currently, we are also exploring augmenting NLI datasets with verifiable proofs.
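The idea of testing each reasoning capability separately with templated examples can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the templates, the capability names, the `evaluate` helper, and the trivial `always_entailment` baseline are hypothetical and are not taken from the CheckList-NLI dataset or its code.

```python
from itertools import product

NAMES = ["Alice", "Bob"]  # hypothetical slot fillers


def numerical_cases():
    """Yield (premise, hypothesis, gold_label) from a numeric-comparison template."""
    for name, n, m in product(NAMES, [3, 5], [2, 7]):
        premise = f"{name} has {n} apples."
        hypothesis = f"{name} has more than {m} apples."
        label = "entailment" if n > m else "contradiction"
        yield premise, hypothesis, label


def temporal_cases():
    """Yield (premise, hypothesis, gold_label) from a before/after template."""
    for first, second in product(NAMES, NAMES):
        if first == second:
            continue
        premise = f"{first} arrived before {second}."
        hypothesis = f"{second} arrived after {first}."
        yield premise, hypothesis, "entailment"


# One generator per capability, so results can be reported per capability.
SUITE = {"numerical": numerical_cases, "temporal": temporal_cases}


def evaluate(predict, suite=SUITE):
    """Return per-capability accuracy for predict(premise, hypothesis) -> label."""
    report = {}
    for capability, generator in suite.items():
        cases = list(generator())
        correct = sum(predict(p, h) == gold for p, h, gold in cases)
        report[capability] = correct / len(cases)
    return report


def always_entailment(premise, hypothesis):
    """Stand-in for a real NLI model; always predicts 'entailment'."""
    return "entailment"


report = evaluate(always_entailment)
print(report)
```

In practice, `always_entailment` would be replaced by a wrapper around a trained NLI model. The per-capability report is what gives the behavioral summary its value: a single aggregate accuracy would hide the gap between the two capabilities.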

Summary and Extensions:

TaxiNLI: Taxonomic Fragmentation of the NLI Task, CoNLL 2020
TaxiXNLI: Multi-lingual Extension of TaxiNLI, EMNLP 2021 MRL Workshop
LoNLI: Testing Diverse Reasoning of NLI Systems, LREV 2023, In Print !!NEW!!

Enhancing NLI: Multi-hop, Causality and Counterfactuals & Reasoning in LLMs

As observed through the TaxiNLI family of work, language models struggle with many important reasoning types. With Deepanway Ghoshal and Monojit Choudhury, we explored a less annotation-intensive way to generate intermediate steps for complex reasoning examples in free-form NLI datasets. We observe not only that we can generate such multi-hop steps without end-to-end supervision, but also that the steps are accurate: they can be used directly as augmentation to improve an NLI model's predictive ability.

References

Generating Intermediate Steps for NLI with Next-Step Supervision, AACL-IJCNLP 2023, Main !!NEW!!

Previously, I was interested in mapping natural language to a formal representation and reasoning with it. My proposed solutions to Question-Answering and the Winograd Schema Challenge during my Ph.D. were motivated by the central idea of semantic parsing, followed by logical (or probabilistic logical) reasoning.

Semantic Parsing (K-Parser)

We (led by co-authors Arpit Sharma and Nguyen Vo) explored mapping natural language to a formal representation that enables logical reasoning. Through several papers (K-Parser IJCAI-15, K-Parser NAACL 15), we showed how such semantic parsing enables us to find event mentions and to solve Winograd Schema Challenge problems (partially, but interpretably).

Somak Aditya

Assistant Professor

My research interests include integrating knowledge and enabling higher-order reasoning in AI.

Publications

EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos | In EMNLP 2025 (Main).
Sourjyadip Ray, Shubham Sharma, Somak Aditya, Pawan Goyal (2025).

PDF nlp vision

LOGICPO: Efficient Translation of NL-based Logical Problems to FOL using LLMs and Preference Optimization | In ArXiv 2025.
Koushik Viswanadha, Deepanway Ghoshal, Somak Aditya (2025).

PDF Code nlp neurosymbolic

STUCK IN THE QUICKSAND OF NUMERACY, FAR FROM AGI SUMMIT: EVALUATING LLMS' MATHEMATICAL COMPETENCY THROUGH ONTOLOGY-GUIDED PERTURBATIONS | In ACL 2025 (Findings).
Pengfei Hong, Deepanway Ghoshal, Navonil Majumdar, Somak Aditya, Rada Mihalcea, Soujanya Poria (2025).

PDF Code nlp symbolicmath

SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation | In NAACL 2025 (Main).
Saurabh Kumar Pandey, Sachin Vashishtha, Somak Aditya, Monojit Choudhury (2025).

PDF nlp

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs | In EMNLP 2024 (Main).
Haritz Puerto, Martin Tutek, Somak Aditya, Xiaodan Zhu, Iryna Gurevych (2024).

PDF Code nlp

ERVQA: A Dataset to Benchmark the Readiness of Large Vision Language Models in Hospital Environments | In EMNLP 2024 (Main).
Sourjyadip Ray, Kushal Gupta, Soumi Kundu, Dr Payal Arvind Kasat, Somak Aditya, Pawan Goyal (2024).

PDF nlp vision

Text2Afford: Probing Object Affordance Prediction abilities of Language Models solely from Text | In CoNLL 2024 (Main).
Sayantan Adak, Daivik Agarwal, Animesh Mukherjee, Somak Aditya (2024).

PDF nlp vision

MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning | In NAACL 2024 (Main, In Print).
Debrup Das, Debopriyo Banerjee, Somak Aditya, Ashish Kulkarni (2024).

PDF Code Project nlp neurosymbolic

Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks | In LREC-COLING 2024 (In Print).
Abhinav Rao, Sachin Vashistha, Atharva Naik, Somak Aditya, Monojit Choudhury (2024).

PDF Code nlp

Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models | In ArXiv 2023.
Man Luo, Shrinidhi Kumbhar, Ming Shen, Mihir Parmar, Neeraj Varshney, Pratyay Banerjee, Somak Aditya, Chitta Baral (2023).

PDF Code Dataset nlp

Prover: Generating Intermediate Steps for NLI with Commonsense Knowledge Retrieval and Next-Step Prediction | In AACL-IJCNLP 2023 (Main).
Deepanway Ghoshal, Somak Aditya, Monojit Choudhury (2023).

PDF Code Poster nlp neurosymbolic

SYNC: A Structurally guided Hard Negative Curricula for Efficient Neural Code Search | In AACL-IJCNLP 2023 (Main).
Atharva Naik, Soumitra Das, Jyothi Vedurada, Somak Aditya (2023).

PDF Code nlp neurosymbolic

LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI | In LREV 2023 (In Print).
Ishan Tarunesh, Somak Aditya, Monojit Choudhury (2023).

PDF Dataset nlp

A Robust Information-Masking Approach for Domain Counterfactual Generation | In ACL 2023 (Long Paper Findings).
Pengfei Hong, Rishabh Bhardwaj, Navonil Majumdar, Somak Aditya, Soujanya Poria (2023).

PDF Code nlp

Multilingual CheckList: Generation and Evaluation | In AACL-IJCNLP 2022 (Long Paper Findings).
Karthikeyan K, Shaily Bhatt, Pankaj Singh, Somak Aditya, Sandipan Dandapat, Sunayana Sitaram, Monojit Choudhury (2022).

PDF nlp

Vector Space Interpolation for Query Expansion | In AACL-IJCNLP 2022 (Short Paper).
Deepanway Ghoshal, Somak Aditya, Sandipan Dandapat, Monojit Choudhury (2022).

PDF nlp

LITMUS Predictor: An AI Assistant for Building Reliable, High-Performing and Fair Multilingual NLP Systems | In AAAI 2022 Demonstrations.
Anirudh Srinivasan, Gauri Kholkar, Rahul Kejriwal, Tanuja Ganu, Sandipan Dandapat, Sunayana Sitaram, Balakrishnan Santhanam, Somak Aditya, Kalika Bali, Monojit Choudhury (2022).

PDF Project nlp

Analyzing the Effects of Reasoning Types on Cross-Lingual Transfer Performance | In EMNLP 2021 MRL Workshop.
Karthikeyan K, Aalok Sathe, Somak Aditya, Monojit Choudhury (2021).

PDF Dataset nlp

Predicting joint intent-slot structure | In USPTO 2021.
Somak Aditya, Sharmila Nangi Reddy, Pranil Joshi, Kushal Chawla, Bhavy Khatri, Abhinav Mishra (2021).

PDF nlp

Trusting RoBERTa over BERT: Insights from CheckListing the Natural Language Inference Task | In ArXiv 2021.
Ishan Tarunesh, Somak Aditya, Monojit Choudhury (2021).

PDF Dataset nlp

Creating a knowledge graph based on text-based knowledge corpora | In USPTO 2021.
Somak Aditya, Atanu Sinha (2021).

PDF nlp

TaxiNLI: Taking a Ride up the NLU Hill | In CoNLL 2020.
Pratik Joshi*, Somak Aditya*, Aalok Sathe*, Monojit Choudhury (2020).

PDF Slides Dataset nlp

Uncovering Relations for Marketing Knowledge Representation | In AAAI 2020, StarAI Workshop.
Somak Aditya, Atanu Sinha (2020).

PDF nlp

Integrating Knowledge and Reasoning in Image Understanding | In IJCAI 2019.
Somak Aditya, Yezhou Yang, Chitta Baral (2019).

PDF vision nlp

Spatial Knowledge Distillation to aid Visual Reasoning | In IEEE WACV 2019.
Somak Aditya, Rudra Saha, Yezhou Yang, Chitta Baral (2019).

PDF vision nlp neurosymbolic

Explicit Reasoning over End-to-End Neural Architectures | In AAAI 2018.
Somak Aditya, Yezhou Yang, Chitta Baral (2018).

PDF Code Project vision nlp neurosymbolic