PHP: split .docx into paragraphs (PREG_SPLIT, "/\.\n/u")


I am trying to read a .docx file and **split it into parts first using the docxconversion class**. The first function is read_docx:

    private function read_docx(){
        $striped_content = '';
        $content = '';

        $zip = zip_open($this->filename);
        if (!$zip || is_numeric($zip)) return false;

        // This call reads the first archive entry before the loop, so the loop starts at the second entry.
        $zip_entry = zip_read($zip);
        while ($zip_entry = zip_read($zip)) {
            if (zip_entry_open($zip, $zip_entry) == false) continue;
            if (zip_entry_name($zip_entry) != "word/document.xml") continue;
            $content .= zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
            zip_entry_close($zip_entry);
        }// end while
        zip_close($zip);

        // Table cell boundaries become spaces; paragraph ends become "\r\n".
        $content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
        $content = str_replace('</w:r></w:p>', "\r\n", $content);
        $content = str_replace("</w:p>", "\r\n", $content);

        $pattern = "/(الفقرة\s[0-9]+\s)|(الأولى الفقرة)./u";//المادة\s*\d:
        $striped_content = strip_tags($content);
        $splitted_para_arr = preg_split($pattern, $striped_content, -1, PREG_SPLIT_NO_EMPTY);

        return $splitted_para_arr;//striped_content;
    }

The second function is converttotext:

    public function converttotext() {
        if (isset($this->filename) && !file_exists($this->filename)) {
            return "file not exists";
        }

        $filearray = pathinfo($this->filename);
        $file_ext  = $filearray['extension'];

        if ($file_ext == "docx") {
            return $this->read_docx();
        } else {
            return "invalid file type";
        }
    }

Then I split each part into paragraphs using the following function:

    public function getparag($article){
        $splitted_para_arr = preg_split("/\.\n/u", $article, -1, PREG_SPLIT_NO_EMPTY);
        return $splitted_para_arr;//striped_content;
    }

But the problem is that I can't split the text into paragraphs with the pattern "/\.\n/u".

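If you want to deal with any kind of line break, use \R:

    $splitted_para_arr = preg_split("/\.\R/", $article, -1, PREG_SPLIT_NO_EMPTY);

\R matches \n, \r or \r\n, whereas \n matches only a line feed. Since read_docx ends each paragraph with "\r\n", a period is followed by \r rather than \n, so "/\.\n/u" never matches.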
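To make the difference concrete, here is a minimal sketch that runs both patterns over the same input. It assumes the text has the shape read_docx produces (paragraphs ending in a period followed by "\r\n"); the sample sentences and the variable names $with_n and $with_R are made up for illustration.

```php
<?php
// Made-up sample text: each paragraph ends with ".\r\n", as read_docx() would emit.
$article = "First paragraph.\r\nSecond paragraph.\r\nThird paragraph.";

// '\.\n' needs a line feed directly after the period. Here a carriage
// return sits in between, so nothing matches and the text is not split.
$with_n = preg_split('/\.\n/u', $article, -1, PREG_SPLIT_NO_EMPTY);

// '\.\R' treats "\r\n" (or a lone "\n" or "\r") as one line break.
$with_R = preg_split('/\.\R/u', $article, -1, PREG_SPLIT_NO_EMPTY);

var_dump(count($with_n)); // int(1)
var_dump($with_R);        // "First paragraph", "Second paragraph", "Third paragraph."
```

Note that the period is part of the delimiter, so it is dropped from every paragraph except the last one, which has no line break after it.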

