| Author |
Message |
distractor Joined: 18 Nov 2009 Posts: 58

Reputation: 19
|
Posted: Sun Dec 06, 2009 4:18 pm Post subject: |
|
|
Is it possible to add wordlist creation from a web page?
For example, I want to extract the words from http://en.wikipedia.org/wiki/Password_recovery into a text file with one word per line.
It would also be nice to add recursive downloading of links down to a selected depth, to follow related keywords. |
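For illustration, the request above could be sketched in Python roughly as follows. This is only a minimal sketch, not how the plugin works: the names `TextCollector`, `words_from_html` and `fetch_html` and the User-Agent string are made up for this example, and recursive link following is left out.

```python
import re
import urllib.request
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def words_from_html(html):
    """Return the unique words of an HTML page in first-seen order (case-sensitive)."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    seen = set()
    words = []
    # [^\W_]+ matches runs of Unicode letters and digits (no underscore).
    for word in re.findall(r"[^\W_]+", text):
        if word not in seen:
            seen.add(word)
            words.append(word)
    return words


def fetch_html(url):
    """Download a page; the User-Agent value is an arbitrary placeholder."""
    req = urllib.request.Request(url, headers={"User-Agent": "wordlist-sketch/0.1"})
    with urllib.request.urlopen(req) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Writing `"\n".join(words_from_html(fetch_html(url)))` to a text file would then give one word per line.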
|
|
 |
 Admin Joined: 09 Nov 2005 Posts: 11263

Location: CCCP
|
Posted: Sun Dec 06, 2009 5:39 pm Post subject: |
|
|
| distractor wrote: | | Is it possible to add wordlist creation from a web page? | I will think about this. |
|
|
 |
TZ- Joined: 16 Aug 2009 Posts: 115

Reputation: 39
|
Posted: Mon Dec 07, 2009 6:35 am Post subject: |
|
|
Accent Keyword Extractor is OK, and WikiWordlistCreator is good for those tasks.
They can serve as a reference for the admin (for the way of extracting) and are good to use until he makes something himself.
Also, if possible, please make the extraction plugin work with Russian letters, so that it extracts words in real Cyrillic letters (UTF-8 or whatever the encoding is) and not in a mangled form like çàáûë ïàðîëü, ìàãâàé94, etc. |
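The mangled samples above look like Windows-1251 bytes that were read as Latin-1. Assuming that is what happened (it is a guess from the samples, not something the tools document), a small Python sketch can reverse it; `fix_cp1251_mojibake` is a hypothetical helper name.

```python
def fix_cp1251_mojibake(text):
    """Undo CP1251 text that was wrongly decoded as Latin-1.

    Assumption: every character came from a single byte misread as Latin-1.
    Text that does not fit this assumption is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("cp1251")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

Note that genuine Latin-1 words with accents would also be converted, so this should only be applied to lists known to be mis-decoded.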
|
|
 |
TZ- Joined: 16 Aug 2009 Posts: 115

Reputation: 39
|
Posted: Tue Dec 08, 2009 12:05 am Post subject: |
|
|
| Note about Accent Keyword Extractor: it tends to crash on a large number of words (at least on my computer), but the words are still saved. Just run its out_all.dat (the file where the words are collected) through Once Is Enough or a similar duplicate-removal program and the work is preserved. You can rename the file to .txt or .dic for future use. |
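If no duplicate-removal program is at hand, the same step can be sketched in a few lines of Python. This assumes out_all.dat is plain text with one word per line, which is not confirmed here; `dedupe_lines` and `dedupe_file` are hypothetical names, and the encoding is a guess you may need to change.

```python
def dedupe_lines(lines):
    """Yield each non-empty line once, in first-seen order, without line endings."""
    seen = set()
    for line in lines:
        word = line.rstrip("\r\n")
        if word and word not in seen:
            seen.add(word)
            yield word


def dedupe_file(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path with duplicate lines removed."""
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        for word in dedupe_lines(src):
            dst.write(word + "\n")
```

For example, `dedupe_file("out_all.dat", "wordlist.txt")`. All unique words are held in memory, which should be acceptable for a few million entries.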
|
|
 |
distractor Joined: 18 Nov 2009 Posts: 58

Reputation: 19
|
Posted: Tue Dec 08, 2009 1:35 am Post subject: |
|
|
| TZ- wrote: | | Note about Accent Keyword Extractor: it tends to crash on a large number of words |
Try running it in Windows XP SP3 compatibility mode; I think this helps. It works for me. |
|
|
 |
TZ- Joined: 16 Aug 2009 Posts: 115

Reputation: 39
|
Posted: Tue Dec 08, 2009 7:59 am Post subject: |
|
|
I am on original XP Pro SP3. Sometimes, once I have a few million words collected, it just crashes.
A good source for crawling is http://rankings.big-boards.com/?sort=members&p=all
(forum boards):
millions of users, usernames, all languages, X subjects and variations of words... |
|
|
 |
distractor Joined: 18 Nov 2009 Posts: 58

Reputation: 19
|
Posted: Sat Jan 09, 2010 2:41 pm Post subject: |
|
|
| Suggestion: remove words by mask/rule |
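To make the suggestion concrete, here is a minimal Python sketch of removing words by mask. It assumes shell-style wildcards (`*`, `?`, `[0-9]`) as the mask syntax, which may differ from whatever the plugin ends up using; `remove_by_mask` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def remove_by_mask(words, mask):
    """Return the words that do NOT match the shell-style mask (case-sensitive)."""
    return [w for w in words if not fnmatchcase(w, mask)]
```

For example, `remove_by_mask(words, "pass*")` drops every word that starts with "pass", and `"??[0-9]"` drops three-character words ending in a digit.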
|
|
 |
 Admin Joined: 09 Nov 2005 Posts: 11263

Location: CCCP
|
Posted: Sat Jan 09, 2010 2:53 pm Post subject: |
|
|
| distractor wrote: | | Suggestion: remove words by mask/rule | Yes, adding this feature is on my agenda. |
|
|
 |
 Admin Joined: 09 Nov 2005 Posts: 11263

Location: CCCP
|
Posted: Fri Apr 02, 2010 7:08 pm Post subject: |
|
|
The plugin has been recompiled to be compatible with Plugins API v1.3.
Additionally in this version:
- Fixed the error that occurred when splitting dictionaries to files larger than 2GB. |
|
|
 |
 Admin Joined: 09 Nov 2005 Posts: 11263

Location: CCCP
|
Posted: Wed Jul 14, 2010 6:33 pm Post subject: |
|
|
Plugin updated on 07/14/2010. In the new version:
- Fixed the error that occurred when sorting several files simultaneously. |
|
|
 |
T3h sh4ng4i Joined: 15 Feb 2012 Posts: 5

Reputation: 1
|
Posted: Wed Feb 29, 2012 6:22 pm Post subject: |
|
|
| How do I download and install this? Can anyone help me? |
|
|
 |
 passcape Joined: 09 Dec 2005 Posts: 69

Reputation: 13
Location: CCCP
|
Posted: Thu Jul 19, 2012 2:50 pm Post subject: |
|
|
Accent Keyword Extractor is a little bit buggy and crashes from time to time. Besides, it is absolutely useless if you need to create a good wordlist, since it may take years to download the content.
Just download Wikipedia as a single archive instead (I can't find the link, so use Google), unpack it and feed it to a third-party parsing tool. I managed to create 2 GB of multilingual wordlists out of the whole Wikipedia archive. By the way, parsing the database took several days to complete. |
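As a rough illustration of the dump-parsing step, the Python sketch below streams words out of a MediaWiki XML export. It is not the tool used above: it assumes the archive is a bz2-compressed XML export whose article bodies sit in `<text>` elements, and `open_dump` and `iter_dump_words` are hypothetical names. Wiki markup is not stripped, so tokens such as template names will end up in the output.

```python
import bz2
import re
import xml.etree.ElementTree as ET


def open_dump(path):
    """Open a bz2-compressed dump for streaming (the path is up to you)."""
    return bz2.open(path, "rb")


def iter_dump_words(stream):
    """Yield letter-only words from every <text> element of an XML export stream."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Tags carry an XML namespace prefix like "{...}text"; compare the local name.
        if elem.tag.rsplit("}", 1)[-1] == "text" and elem.text:
            # [^\W\d_]+ matches runs of Unicode letters only.
            yield from re.findall(r"[^\W\d_]+", elem.text)
        elem.clear()  # drop content that has already been processed
```

The yielded words still contain duplicates, so a sort or duplicate-removal pass is needed afterwards.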
|
|
 |
|