Core feature - converting a PDF into an audio file

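This script converts a PDF into an audio file. The principle is simple: first extract the text from the PDF with PyPDF2, then turn that text into speech with pyttsx3.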

Implementation

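import pyttsx3
import PyPDF2

# Uses the PyPDF2 3.x names; older releases called these
# PdfFileReader, numPages, getPage and extractText.
with open('story.pdf', 'rb') as pdf_file:
    pdfreader = PyPDF2.PdfReader(pdf_file)
    pages = []
    for page in pdfreader.pages:
        text = page.extract_text() or ''                 # extract the text of this page
        cleaned_text = text.strip().replace('\n', ' ')   # trim the ends, turn line breaks into spaces
        print(cleaned_text)                              # print the text from the PDF
        pages.append(cleaned_text)

speaker = pyttsx3.init()
# speaker.say(' '.join(pages))   # uncomment to read the text aloud instead
# Save once, after the loop: saving inside the loop would overwrite 'story.mp3' on every page.
# The actual audio encoding depends on the platform's speech driver, whatever the extension.
speaker.save_to_file(' '.join(pages), 'story.mp3')
speaker.runAndWait()
speaker.stop()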
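
The cleaning step in the script only trims the ends of each page and replaces line breaks with spaces. Text extracted from a PDF often also contains words hyphenated across line breaks and runs of repeated spaces, which can make the speech sound choppy. Below is a minimal sketch of a slightly more thorough cleaner, using only the standard library. The names `clean_page_text` and `join_pages` are hypothetical helpers introduced here for illustration; they are not part of PyPDF2 or pyttsx3.

```python
import re


def clean_page_text(text: str) -> str:
    """Normalize the text extracted from one PDF page before speech synthesis."""
    # Rejoin words hyphenated across a line break: "exam-\nple" -> "example"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse every run of whitespace (newlines, tabs, spaces) into one space
    return re.sub(r"\s+", " ", text).strip()


def join_pages(pages):
    """Clean each page and join the non-empty ones into a single string."""
    cleaned = (clean_page_text(p) for p in pages)
    return " ".join(c for c in cleaned if c)
```

One way to use it would be to collect the raw page texts in a list and pass `join_pages(raw_pages)` to `speaker.save_to_file`. Note that the hyphen rule is a heuristic: it also merges genuine compounds that happen to break at the hyphen (for example "well-" followed by "known" on the next line), so it may need adjusting for some documents.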