Updated: 04/30/2006
- Building indexers or spiders that can read binary MS Word (.doc) documents can be difficult, especially on *nix servers, where PHP's COM support is unavailable.
Solutions usually involve installing binaries on the server, which is often impossible or disallowed.
This simple PHP snippet does a reasonably good job of extracting text from an MS Word document for use in a search index. It does not claim to be perfect, but it has proved useful on thousands of test documents.
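The listing does not reproduce the snippet itself, so the following is only a rough sketch of one way such pure-PHP extraction could work. It is not the listed script. It assumes a simple heuristic: scan the raw bytes of the .doc file and keep runs of printable ASCII characters, stored either as 8-bit text or as UTF-16LE (each character followed by a NUL byte). The function name `extract_doc_text` and the `$minRun` parameter are hypothetical, and real documents will also yield some noise (font names, metadata) with this approach.

```php
<?php
/**
 * Heuristic sketch (hypothetical helper, not the listed snippet):
 * pull readable text out of the raw bytes of a binary .doc file.
 *
 * @param string $bytes  Raw file contents.
 * @param int    $minRun Minimum number of characters a run must have to be kept.
 */
function extract_doc_text(string $bytes, int $minRun = 4): string
{
    // First alternative: UTF-16LE runs in the ASCII range (printable byte + NUL).
    // Second alternative: plain 8-bit printable ASCII runs.
    // No /u flag, so the pattern is matched byte by byte.
    $pattern = '/(?:[\x20-\x7E]\x00){' . $minRun . ',}|[\x20-\x7E]{' . $minRun . ',}/';

    $runs = [];
    if (preg_match_all($pattern, $bytes, $matches)) {
        foreach ($matches[0] as $run) {
            // Remove the NUL high bytes left over from UTF-16LE runs.
            $runs[] = str_replace("\x00", '', $run);
        }
    }

    // Join the runs with spaces so they can be fed to a search indexer.
    return implode(' ', $runs);
}
```

A call might look like `echo extract_doc_text(file_get_contents('report.doc'));`, where `report.doc` is a placeholder file name. Raising `$minRun` trades recall for less binary noise in the index.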