Comparative Performance Analysis of Text Summarization: A Case Study of Extractive (TF-IDF, TextRank) and Abstractive (LLM) Methods

Authors

  • Yustia Hapsari, Universitas Pancasakti Tegal
  • Muhammad Fikri Hidayattullah, University of Harkat Negeri
  • Rifyal Aidil Dziaul Haq, University of Harkat Negeri

DOI:

https://doi.org/10.46808/iitp.v3i1.95

Keywords:

Comparative Analysis, Abstractive Method, Extractive Method, Text Summarization, Natural Language Processing

Abstract

This study presents a comparative performance analysis of two major paradigms in text summarization. The extractive paradigm, which selects significant sentences directly from the source text, is implemented through two approaches: (1) the statistical TF-IDF algorithm, which scores each sentence by the accumulated significance weights of its words; and (2) the graph-based TextRank algorithm, which represents sentences as nodes and determines their importance through centrality analysis within a semantic network. The abstractive paradigm is represented by the Large Language Model (LLM) Gemini, which interprets the context of the source text as a whole and generates entirely new, coherent summary sentences. A qualitative comparison of the outputs of the three methods reveals a fundamental trade-off. The abstractive method (Gemini) performs better on narrative quality, producing summaries that are highly coherent, fluent, and natural-sounding, resembling human writing. Conversely, the extractive methods (TF-IDF and TextRank) inherently ensure factual consistency with the source: because every summary sentence is taken verbatim from the original text, there is no risk of misinterpretation or hallucinated information. Among the extractive methods, the analysis indicates that TextRank tends to produce more structured and readable summaries than TF-IDF, owing to its ability to consider inter-sentence relationships. The study concludes that the choice of summarization method should be aligned with the priorities of the use case: abstractive methods are better suited to readability-focused tasks, whereas extractive methods are preferable for applications that demand absolute factual reliability.
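The two extractive approaches named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, each sentence is treated as its own "document" for the IDF term, and Jaccard word overlap stands in for the sentence similarity used to build the TextRank graph. All of these are assumptions made for illustration. The abstractive (Gemini) side is omitted because it depends on an external service.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stop-word removal or stemming.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_sentence_scores(sentences):
    # Score each sentence as the sum of the TF-IDF weights of its words,
    # treating every sentence as one document (assumption).
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = sum(
            (count / len(doc)) * math.log(n / df[word])
            for word, count in tf.items()
        )
        scores.append(score)
    return scores


def textrank_scores(sentences, damping=0.85, iterations=50):
    # Sentences are nodes; edge weights are Jaccard word overlap (assumption).
    # Scores come from a weighted PageRank-style power iteration.
    n = len(sentences)
    if n == 0:
        return []
    words = [set(tokenize(s)) for s in sentences]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and words[i] and words[j]:
                sim[i][j] = len(words[i] & words[j]) / len(words[i] | words[j])
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, scores, k=2):
    # Keep the k highest-scoring sentences, restored to source order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "Text summarization shortens a document. "
        "Extractive summarization selects sentences from the document. "
        "Abstractive summarization writes new sentences. "
        "The weather was pleasant."
    )
    sents = split_sentences(sample)
    print(summarize(sents, tfidf_sentence_scores(sents)))
    print(summarize(sents, textrank_scores(sents)))
```

Both scorers return one number per sentence, and `summarize` only ever returns sentences that already exist in the input, which is the property behind the factual-consistency argument in the abstract. The difference is in what each score sees: the TF-IDF score looks at a sentence in isolation, while the TextRank score depends on how strongly a sentence is connected to the others.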

Published

2025-10-06

Issue

Section

Articles