Legal disclaimer >>> The information on this site is intended to be used for legal and ethical purposes like research, education, journalism and educating the public. Our intention is to comply with any and all applicable laws. If you can provide legal advice, please let us know.

Parsing and filtering the YouPorn password leak

This article is about parsing the leaked files related to the YouPorn incident of February 2012. You can read about the incident here.

Parsing and importing into MySQL

The relevant records appear to be formatted in only one of two ways. The following script extracts them and inserts them into a MySQL table:

parse.php
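<?php
// Connect to the local MySQL server; adjust the credentials to your setup.
$db = new PDO('mysql:dbname=cracking;host=127.0.0.1', 'root', '');
// Throw exceptions on SQL errors instead of failing silently.
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->query('SET NAMES utf8mb4');
// Wrap all inserts in one transaction so they are committed together.
$db->beginTransaction();
// Prepare both statements once and reuse them for every file.
// The first format has no email_confirm field, so NULL is stored in that column.
$insertShort = $db->prepare("INSERT INTO `cracking`.`2012-02-22-youporn` VALUES (?, ?, NULL, ?, ?)");
$insertFull = $db->prepare("INSERT INTO `cracking`.`2012-02-22-youporn` VALUES (?, ?, ?, ?, ?)");
$fileCount = 1;
foreach (glob('*.log') as $filename) {
    $i = 0;
    $string = file_get_contents($filename);
    // Format 1: username, email, password, password_confirm
    if (preg_match_all('/username=(.*)\semail=(.*)\spassword=(.*)\spassword_confirm=(.*)\s/', $string, $matches)) {
        foreach ($matches[1] as $k => $v) {
            $insertShort->execute(array($matches[1][$k], $matches[2][$k], $matches[3][$k], $matches[4][$k]));
            $i++;
        }
    }
    // Format 2: username, email, email_confirm, password, password_confirm
    if (preg_match_all('/username=(.*)\semail=(.*)\semail_confirm=(.*)\spassword=(.*)\spassword_confirm=(.*)\s/', $string, $matches)) {
        foreach ($matches[1] as $k => $v) {
            $insertFull->execute(array($matches[1][$k], $matches[2][$k], $matches[3][$k], $matches[4][$k], $matches[5][$k]));
            $i++;
        }
    }
    // Report how many rows were inserted from this file.
    echo 'File #', $fileCount, " (", $filename, "). ", "Executed ", $i, " statements. \n";
    $fileCount++;
}
$db->commit();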
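
The two patterns can be checked without a database. The following is a minimal sketch in Python, not part of the original script, that applies the same regular expressions to a made-up sample. The sample text, its one-field-per-line layout and the helper name `parse` are assumptions for illustration, not an excerpt from the actual files.

```python
import re

# The same two patterns as in parse.php. As in PHP, "." does not match a
# newline by default, so each capture group stays on a single line.
SHORT = re.compile(
    r"username=(.*)\semail=(.*)\spassword=(.*)\spassword_confirm=(.*)\s"
)
FULL = re.compile(
    r"username=(.*)\semail=(.*)\semail_confirm=(.*)"
    r"\spassword=(.*)\spassword_confirm=(.*)\s"
)

# Hypothetical sample data; the one-field-per-line layout is an assumption.
SAMPLE = (
    "username=alice\nemail=alice@example.com\n"
    "password=hunter2\npassword_confirm=hunter2\n"
    "username=bob\nemail=bob@example.com\nemail_confirm=bob@example.com\n"
    "password=secret\npassword_confirm=secret\n"
)


def parse(text):
    """Return rows in the table's column order:
    (username, email, email_confirm, password, password_confirm)."""
    rows = []
    for user, email, password, confirm in SHORT.findall(text):
        # The short format has no email_confirm field.
        rows.append((user, email, None, password, confirm))
    rows.extend(FULL.findall(text))
    return rows


rows = parse(SAMPLE)
for row in rows:
    print(row)
```

With this layout each record matches exactly one of the two patterns. If the real files separate the fields differently, the greedy `(.*)` groups may behave differently, so it is worth running a check like this against a small excerpt first.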

For this to work, you need a table like this:

table.sql
CREATE TABLE `cracking`.`2012-02-22-youporn` (
  `username` blob,
  `email` blob,
  `email_confirm` blob,
  `password` blob,
  `password_confirm` blob,
  KEY `username` (`username`(128)),
  KEY `email` (`email`(128)),
  KEY `email_confirm` (`email_confirm`(128)),
  KEY `password` (`password`(128)),
  KEY `password_confirm` (`password_confirm`(128))
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Filtering information

We want to exclude form submissions that resulted in an error and did not actually create a new user. Let's fetch account creation attempts where:

  • the password equals password_confirm
  • the password is not empty
  • the email address is not empty
  • each combination of email and password appears only once

filter.sql
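-- Note: under ONLY_FULL_GROUP_BY (part of the default sql_mode since MySQL 5.7.5)
-- this query is rejected, because non-grouped columns such as username are selected.
-- Either remove that mode for the session or wrap those columns in ANY_VALUE().
SELECT
  CONVERT(username USING utf8mb4),
  CONVERT(email USING utf8mb4),
  CONVERT(email_confirm USING utf8mb4),
  CONVERT(password USING utf8mb4),
  CONVERT(password_confirm USING utf8mb4)
FROM `cracking`.`2012-02-22-youporn`
WHERE password = password_confirm AND password != '' AND email != ''
GROUP BY email, password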
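
The same filter can also be expressed outside SQL. Below is a rough sketch in Python, assuming the rows are tuples in the table's column order. The function name `filter_rows` and the sample rows are hypothetical. It keeps the first row seen for each (email, password) pair, whereas MySQL is free to return any row of a group.

```python
def filter_rows(rows):
    """Keep plausible successful sign-ups, one per (email, password) pair."""
    seen = set()
    kept = []
    for row in rows:
        username, email, email_confirm, password, password_confirm = row
        # Exclude missing (None or empty) passwords and email addresses.
        if not password or not email:
            continue
        # Exclude submissions where the confirmation did not match.
        if password != password_confirm:
            continue
        # Exclude duplicate combinations of email and password.
        key = (email, password)
        if key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return kept


# Made-up rows for illustration only.
sample = [
    ("alice", "alice@example.com", None, "hunter2", "hunter2"),
    ("alice2", "alice@example.com", None, "hunter2", "hunter2"),  # duplicate pair
    ("bob", "bob@example.com", "bob@example.com", "secret", "secrte"),  # mismatch
    ("carol", "", None, "abc123", "abc123"),  # missing email
    ("dave", "dave@example.com", None, "", ""),  # missing password
]
result = filter_rows(sample)
print(result)
```

Only the first row survives in this sample. The comparison is exact and case-sensitive, which corresponds to how the blob columns are compared in the query.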

Exporting passwords and running statistical analysis

You can also export the passwords to a file and run statistical analysis tools, such as Passpal 0.4, on it:

export.sql
SELECT password
FROM `cracking`.`2012-02-22-youporn`
WHERE password = password_confirm AND password != '' AND email != ''
GROUP BY email, password
INTO OUTFILE 'C:/passwords.txt'
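
For a quick look at an exported list without a dedicated tool, the headline figures (total words, unique words, most frequent words) can be approximated in a few lines. This is a sketch in Python, not Passpal's implementation. `summarize` is a hypothetical helper, the word list is made up, and the percentages are rounded to four decimals only to resemble a typical report layout.

```python
from collections import Counter


def summarize(words, top=10):
    """Return total count, unique count and the most frequent words.

    Each entry in "top" is (word, count, percent of total).
    """
    counts = Counter(words)
    total = len(words)
    return {
        "total": total,
        "unique": len(counts),
        "top": [
            (word, count, round(100.0 * count / total, 4))
            for word, count in counts.most_common(top)
        ],
    }


# Made-up words for illustration; in practice, read one word per line
# from the exported file.
stats = summarize(["123456", "password", "123456", "qwerty"])
print(stats["total"], stats["unique"], stats["top"][0])
```

Note that `INTO OUTFILE` escapes special characters such as tabs and backslashes by default, so a file read back line by line may need unescaping before the counts are exact.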

Passpal 0.4 dump

Running Passpal on the exported file, you might get a report like this (here's a link to the report by itself):

passpal 0.4 report (www.thepasswordproject.com)

Total words: 	1566156
Unique words: 	833994 (53.25 %)

Word frequency, sorted by count, top 10
+------------------------------+
|   Word    | Count | Of total |
+------------------------------+
| 123456    | 37981 | 2.4251 % |
| 123456789 | 10415 | 0.665 %  |
| 12345     |  6921 | 0.4419 % |
| 1234      |  4725 | 0.3017 % |
| password  |  4361 | 0.2785 % |
| 12345678  |  3235 | 0.2066 % |
| qwerty    |  3203 | 0.2045 % |
| 1234567   |  2982 | 0.1904 % |
| 111111    |  2354 | 0.1503 % |
| 123       |  2277 | 0.1454 % |
+------------------------------+


Base word (len>=3) frequency, sorted by count, top 10
+-----------------------------+
|   Word   | Count | Of total |
+-----------------------------+
| password |  5826 | 0.372 %  |
| qwerty   |  4230 | 0.2701 % |
| youporn  |  2601 | 0.1661 % |
| sex      |  1928 | 0.1231 % |
| abc      |  1873 | 0.1196 % |
| pussy    |  1838 | 0.1174 % |
| fuckyou  |  1499 | 0.0957 % |
| ficken   |  1339 | 0.0855 % |
| sexy     |  1305 | 0.0833 % |
| love     |  1264 | 0.0807 % |
+-----------------------------+


Length frequency, sorted by length, full table
+-----------------------------+
| Length | Count  | Of total  |
+-----------------------------+
|      0 |      1 | 0.0001 %  |
|      1 |    865 | 0.0552 %  |
|      2 |   3164 | 0.202 %   |
|      3 |  15949 | 1.0184 %  |
|      4 |  65196 | 4.1628 %  |
|      5 |  81589 | 5.2095 %  |
|      6 | 413271 | 26.3876 % |
|      7 | 255748 | 16.3297 % |
|      8 | 325944 | 20.8117 % |
|      9 | 173534 | 11.0802 % |
|     10 | 118276 | 7.552 %   |
|     11 |  50045 | 3.1954 %  |
|     12 |  29770 | 1.9008 %  |
|     13 |  13806 | 0.8815 %  |
|     14 |   8314 | 0.5309 %  |
|     15 |   4539 | 0.2898 %  |
|     16 |   2920 | 0.1864 %  |
|     17 |   1074 | 0.0686 %  |
|     18 |    747 | 0.0477 %  |
|     19 |    321 | 0.0205 %  |
|     20 |    358 | 0.0229 %  |
|     21 |    156 | 0.01 %    |
|     22 |    105 | 0.0067 %  |
|     23 |     68 | 0.0043 %  |
|     24 |     47 | 0.003 %   |
|     25 |     37 | 0.0024 %  |
|     26 |     32 | 0.002 %   |
|     27 |     23 | 0.0015 %  |
|     28 |     10 | 0.0006 %  |
|     29 |     12 | 0.0008 %  |
|     30 |      8 | 0.0005 %  |
|     31 |     11 | 0.0007 %  |
|     32 |    216 | 0.0138 %  |
+-----------------------------+


Charset frequency, sorted by count, full table
+-------------------------------------------------------------------------+
|           Charset            |  Count  | Of total  |   Count/keyspace   |
+-------------------------------------------------------------------------+
| lower-upper-numeric-symbolic | 1560739 | 99.6541 % | 16428.831578947367 |
| lower-upper-numeric          | 1532315 | 97.8392 % |  24714.75806451613 |
| lower-numeric-symbolic       | 1476935 | 94.3032 % | 21404.855072463768 |
| lower-numeric                | 1454060 | 92.8426 % | 40390.555555555555 |
| lower-upper-symbolic         |  763053 | 48.7214 % |  8977.094117647059 |
| lower-upper                  |  750871 | 47.9436 % | 14439.826923076924 |
| lower-symbolic               |  722999 | 46.1639 % |  12254.22033898305 |
| lower                        |  712706 | 45.5067 % |  27411.76923076923 |
| upper-numeric-symbolic       |  339397 | 21.6707 % |  4918.797101449275 |
| upper-numeric                |  336290 | 21.4723 % |  9341.388888888889 |
| numeric-symbolic             |  309025 | 19.7314 % |  7186.627906976744 |
| numeric                      |  307021 | 19.6035 % |            30702.1 |
| upper-symbolic               |   17832 | 1.1386 %  | 302.23728813559325 |
| upper                        |   16904 | 1.0793 %  |  650.1538461538462 |
| symbolic                     |     434 | 0.0277 %  | 13.151515151515152 |
+-------------------------------------------------------------------------+

Charset frequency, sorted by count/keyspace, full table
+-------------------------------------------------------------------------+
|           Charset            |  Count  | Of total  |   Count/keyspace   |
+-------------------------------------------------------------------------+
| lower-numeric                | 1454060 | 92.8426 % | 40390.555555555555 |
| numeric                      |  307021 | 19.6035 % |            30702.1 |
| lower                        |  712706 | 45.5067 % |  27411.76923076923 |
| lower-upper-numeric          | 1532315 | 97.8392 % |  24714.75806451613 |
| lower-numeric-symbolic       | 1476935 | 94.3032 % | 21404.855072463768 |
| lower-upper-numeric-symbolic | 1560739 | 99.6541 % | 16428.831578947367 |
| lower-upper                  |  750871 | 47.9436 % | 14439.826923076924 |
| lower-symbolic               |  722999 | 46.1639 % |  12254.22033898305 |
| upper-numeric                |  336290 | 21.4723 % |  9341.388888888889 |
| lower-upper-symbolic         |  763053 | 48.7214 % |  8977.094117647059 |
| numeric-symbolic             |  309025 | 19.7314 % |  7186.627906976744 |
| upper-numeric-symbolic       |  339397 | 21.6707 % |  4918.797101449275 |
| upper                        |   16904 | 1.0793 %  |  650.1538461538462 |
| upper-symbolic               |   17832 | 1.1386 %  | 302.23728813559325 |
| symbolic                     |     434 | 0.0277 %  | 13.151515151515152 |
+-------------------------------------------------------------------------+


Hashcat mask frequency, sorted by count, top 10
+------------------------------------------------------------------------+
|           Mask           | Count  | Of total  |     Count/keyspace     |
+------------------------------------------------------------------------+
| ?l?l?l?l?l?l             | 194284 | 12.4051 % |  0.0006289222341302504 |
| ?d?d?d?d?d?d             | 136252 | 8.6998 %  |               0.136252 |
| ?l?l?l?l?l?l?l           | 131012 | 8.3652 %  | 1.6311640480682596e-05 |
| ?l?l?l?l?l?l?l?l?l?l?l?l | 129221 | 8.2508 %  |  1.354106809090646e-12 |
| ?l?l?l?l?l?l?l?l?l       |  67597 | 4.3161 %  | 1.2449940914810972e-08 |
| ?l?l?l?l?l               |  54885 | 3.5044 %  |   0.004619414451659471 |
| ?d?d?d?d?d?d?d?d?l?l?l?l |  51634 | 3.2969 %  |  1.129906165750499e-09 |
| ?l?l?l?l?l?l?l?l?l?l     |  42464 | 2.7114 %  | 3.0080664196893873e-10 |
| ?l?l?l?l?l?l?d?d?l?l?l?l |  39088 | 2.4958 %  | 2.7689172054638933e-12 |
| ?l?l?l?l?l?l?l?l         |  35665 | 2.2772 %  | 1.7078724959532325e-07 |
+------------------------------------------------------------------------+
Words that didn't match any ?l?u?d?s mask: 5417 (0.3459 %)

Hashcat mask frequency, sorted by count/keyspace, top 10
+-------------------------------------------------+
|  Mask  | Count | Of total |   Count/keyspace    |
+-------------------------------------------------+
| ?d     |   279 | 0.0178 % |                27.9 |
| ?l     |   534 | 0.0341 % |   20.53846153846154 |
| ?d?d   |   655 | 0.0418 % |                6.55 |
| ?d?d?d |  3849 | 0.2458 % |               3.849 |
| ?l?l   |  2215 | 0.1414 % |   3.276627218934911 |
| ?s     |    29 | 0.0019 % |  0.8787878787878788 |
| ?u     |    19 | 0.0012 % |  0.7307692307692307 |
| ?l?l?l | 10982 | 0.7012 % |  0.6248293126991352 |
| ?l?d   |    62 | 0.004 %  | 0.23846153846153847 |
| ?u?u   |   132 | 0.0084 % |  0.1952662721893491 |
+-------------------------------------------------+
Words that didn't match any ?l?u?d?s mask: 5417 (0.3459 %)


Charset distribution of characters in beginning and end of words (len>=6)
+-------------------------------------------------------------------------------------------------+
| Charset\Index | 0 (first char) |     1     |     2     |    -3     |    -2     | -1 (last char) |
+-------------------------------------------------------------------------------------------------+
| lower         | 72.4842 %      | 75.0555 % | 74.7783 % | 65.2347 % | 55.6883 % | 50.3243 %      |
| upper         | 4.976 %        | 1.9758 %  | 1.9396 %  | 1.6236 %  | 1.4058 %  | 1.3221 %       |
| digits        | 22.3639 %      | 22.7691 % | 23.0835 % | 32.7727 % | 42.5487 % | 47.6496 %      |
| symbols       | 0.1759 %       | 0.1996 %  | 0.1986 %  | 0.369 %   | 0.3572 %  | 0.7039 %       |
+-------------------------------------------------------------------------------------------------+


Total characters: 	11727903
Unique characters: 	356
Top 50 characters: 	ae1oirns2l0t3m9c45d68uh7bkpgyfwjvxzqAESORINLMTCDBP

Character frequency, sorted by count, top 10
+-------------------------------+
| Character | Count  | Of total |
+-------------------------------+
| a         | 880245 | 7.5056 % |
| e         | 754693 | 6.435 %  |
| 1         | 648644 | 5.5308 % |
| o         | 583786 | 4.9778 % |
| i         | 534381 | 4.5565 % |
| r         | 514377 | 4.3859 % |
| n         | 508634 | 4.337 %  |
| s         | 490661 | 4.1837 % |
| 2         | 461183 | 3.9324 % |
| l         | 435806 | 3.716 %  |
+-------------------------------+


Symbol frequency, sorted by count, top 10
+----------------+
| Symbol | Count |
+----------------+
| .      |  8299 |
| !      |  5150 |
| _      |  4819 |
| -      |  4616 |
|        |  4332 |
| @      |  4007 |
| *      |  2326 |
| #      |  1623 |
| /      |  1291 |
| ,      |  1269 |
+----------------+
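
The Count/keyspace columns can be reproduced from the figures in the report. For a charset, the count is divided by the number of characters in that charset (307021 / 10 = 30702.1 for numeric). For a mask, it is divided by the product of the per-position charset sizes (136252 / 10^6 = 0.136252 for ?d?d?d?d?d?d). The Python sketch below illustrates the idea. It is not Passpal's code: the function names are hypothetical, and the charset sizes (26 lower, 26 upper, 10 digits, 33 symbols including the space) are inferred from the ratios in the tables above.

```python
import string

# Charset sizes inferred from the report: 26 + 26 + 10 + 33 = 95 printable
# ASCII characters. The space is counted as a symbol.
SYMBOLS = set(string.punctuation + " ")
SIZES = {"?l": 26, "?u": 26, "?d": 10, "?s": len(SYMBOLS)}


def mask_of(word):
    """Return the hashcat-style mask of a word, or None if any character
    falls outside the four ASCII classes."""
    parts = []
    for ch in word:
        if ch in string.ascii_lowercase:
            parts.append("?l")
        elif ch in string.ascii_uppercase:
            parts.append("?u")
        elif ch in string.digits:
            parts.append("?d")
        elif ch in SYMBOLS:
            parts.append("?s")
        else:
            return None
    return "".join(parts)


def keyspace(mask):
    """Number of candidate words covered by a mask."""
    size = 1
    for i in range(0, len(mask), 2):
        size *= SIZES[mask[i:i + 2]]
    return size


print(mask_of("abc123"))                  # mask of a made-up word
print(136252 / keyspace("?d?d?d?d?d?d"))  # ratio for the six-digit mask
print(194284 / keyspace("?l?l?l?l?l?l"))  # ratio for the six-lowercase mask
```

A high count/keyspace value means a mask covers many passwords relative to the number of candidates it generates, which is why the short masks lead the table sorted by that ratio even though their absolute counts are small.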