Unit 3
In this unit, you will learn how text is extracted from PDFs, especially with Optical Character Recognition (OCR). You will also learn the theory of how Natural Language Processing (NLP) and large language model (LLM) tools extract data from text. In the interactive session, you may discuss obstacles you are facing in your own project.