Python script to convert pdf to markdown file

python

While working on AI Code Agent, we want to generate Terraform scripts by referring to pdf file. This pdf file contains the AWS setup information like VPC, Subnet Mask, Availability Zone etc. Currently these AI Code agents has pdf file size limitation hence we converted the pdf to markdown through our python script.

So the above was the scenario which lead me to write a python script which will convert pdf file to markdown.

Python Script: convert_pdf_markdown.py

We have git commit the code in our Github repo. Follow the README file in the repo for easy setup and for executing the python script.

Once you clone the git repo in your system. Do the following steps (command line is also provided in the README file)

  1. Create Python virtual environment
  2. Activate the Python virtual environment.
  3. Install the Python library which are listed in requirements.txt file with the help of pip command.
  4. Upload your pdf file in the directory called pdf.
  5. Execute the python script convert_pdf_markdown.py.
  6. In input , give pdf file name which you want to convert to markdown file.
  7. After successful execution, find the output file inside the markdown directory.

Watch this video to know how to execute the python script.

Git Repo : convert_pdf_markdown

For any information, question or appreciation, kindly drop your comments in the comment section.

Note: This is one of the step for our AI Coding Agent setup. In our next blog post, we will share about the AI Code agent.

Read Some More Articles