How to crack a captcha

Published on August 22nd, 2008

Ingredients:

- a pligg captcha
- gocr
- ImageMagick
- PHP

You’ve probably already heard of pligg: it’s a digg clone that is pretty easy to install. That means a lot of people already have it running on their domain (free backlinks with different IPs and PageRank!).

Google: inurl:live_comments.php

Pligg demands that you enter the right captcha answer at sign-up. Once you’re signed up, however, you no longer need to enter captchas. So if you’re like me, you sign up for the sites, keep the logins and passwords in a nice list, put a small bot together and throw a couple of links at the pligg sites.

If, however, you want to sign up for a lot of pligg sites and don’t want to do it manually, you’ll have to crack the captcha pligg provides. Luckily the captcha is pretty easy to crack, since there is enough space between the digits.

We start by downloading the captcha from PHP with wget. You can fetch the captcha’s URL from the sign-up page with fsockopen/curl:

  exec("/usr/bin/wget -O /home/cracker/captcha.jpg 'http://www.pliggsite.com/ts_image.php?ts_random=[xxxx]' > /dev/null");

captcha image

Okay, so we’ve downloaded the captcha and saved it. Now let’s run some ImageMagick operations to clear the noise in the background. We’ll flood-fill the background with black, negate the image so the background ends up white, and shave off the border.

  // re-encode at full quality
  exec("/usr/local/bin/convert /home/cracker/captcha.jpg -quality 100 /home/cracker/captcha.jpg");
  // flood-fill the background with black, starting at pixel 5,5
  exec("/usr/local/bin/convert /home/cracker/captcha.jpg -fuzz 25000 -fill black -draw 'color 5,5 floodfill' -quality 100 /home/cracker/captcha_c.jpg");
  // negate the picture, so the black background becomes white
  exec("/usr/local/bin/convert /home/cracker/captcha_c.jpg -negate -quality 100 /home/cracker/captcha_h.jpg");
  // shave 10 pixels off every edge to remove the border
  exec("/usr/local/bin/convert /home/cracker/captcha_h.jpg -shave 10x10 -quality 100 /home/cracker/captcha_cracked.jpg");

Now we have a nice image with no background noise:

image clean

OK, so that’s looking pretty good. However, we need to optimize this image a bit. First of all, for gocr to work properly we need to add more space between the digits. We should also make the digits bolder so gocr recognizes them more reliably.

  $im = imagecreatefromjpeg('captcha_cracked.jpg');

  $width = imagesx($im);
  $height = imagesy($im);

  $x = 0;
  $y = 0;

  $new = imagecreate($width+300, $height+200); //make space for the extra spacing
  $white = imagecolorallocate($new, 255, 255, 255); //first allocated color becomes the background
  $black = imagecolorallocate($new, 0, 0, 0);
  $start = false;   //true once the first column with black pixels was seen
  $hitfound = 0;    //number of gaps widened so far
  $newx = 0;
  $newy = 0;
  $lastx = 0;       //source column of the last widened gap
  while ($x < $width) {
      $y = 0;
      $newy = 0; //keep the output row in step with the source row

      $foundblack = 0;

      while ($y < $height) {
          if ((16777215 - imagecolorat($im, $x, $y)) > 1211142 ) {
              //dark pixel: copy it and thicken it by one pixel left and right
              $newy = $y;
              imagesetpixel($new, $newx, $newy, $black);
              imagesetpixel($new, $newx+1, $newy, $black);
              imagesetpixel($new, $newx-1, $newy, $black);

              $foundblack++;
          } else {
              //light pixel: fill it in if it is sandwiched between two dark ones
              if ( (aboveme($im, $x, $y) && belowme($im, $x, $y)) ) {
                  imagesetpixel($new, $newx, $newy, $black);
                  $foundblack++;
              } else {
                  imagesetpixel($new, $newx, $newy, $white);
              }
          }
          $y++; $newy++;
      }
      //an (almost) empty column after the first digit is a gap: widen it by 50 pixels
      if ( ($foundblack < 2) && ($start) && ($hitfound < 7) && ($x > $lastx+6) ) {
          $newx += 50;
          $lastx = $x;
          $hitfound++;
      }
      if (($foundblack > 0) && (!$start)) $start = true;
      $x++; $newx++;
  }
  imagejpeg($new, 'old.jpg', 100);

  //round 2: thicken once more, this time also filling diagonal holes

  $im = imagecreatefromjpeg('old.jpg');

  $width = imagesx($im);
  $height = imagesy($im);

  $x = 0;
  $y = 0;

  $new = imagecreate($width+200, $height+200);
  $white = imagecolorallocate($new, 255, 255, 255);
  $black = imagecolorallocate($new, 0, 0, 0);
  $start = false;
  $hitfound = 0;
  $newx = 0;
  $newy = 0;

  while ($x < $width) {
      $y = 0;
      $newy = 0; //keep the output row in step with the source row

      $foundblack = 0;

      while ($y < $height) {
          if ((16777215 - imagecolorat($im, $x, $y)) > 2211142 ) {
              $newy = $y;
              imagesetpixel($new, $newx, $newy, $black);
              imagesetpixel($new, $newx+1, $newy, $black);
              imagesetpixel($new, $newx-1, $newy, $black);

              $foundblack++;
          } else {
              if ( (aboveme($im, $x, $y) && belowme($im, $x, $y)) || ( diagonal($im, $x, $y) ) ) {
                  imagesetpixel($new, $newx, $newy, $black);
                  $foundblack++;
              } else {
                  imagesetpixel($new, $newx, $newy, $white);
              }
          }
          $y++; $newy++;
      }
      if (($foundblack > 0) && (!$start)) $start = true;
      if ( ($foundblack < 2) && ($start) && ($hitfound < 6) ) {
          $hitfound++;
      }
      $x++; $newx++;
  }
  imagejpeg($new, 'new.jpg', 100);

  //helpers: test the neighbours of pixel ($x, $y)
  function aboveme($im, $x, $y) {
      return isBlack($im, $x, $y-1);
  }
  function diagonal($im, $x, $y) {
      return (isBlack($im, $x+1, $y-1) && isBlack($im, $x-1, $y+1) && isBlack($im, $x+2, $y-2));
  }

  function belowme($im, $x, $y) {
      return isBlack($im, $x, $y+1);
  }
  function doublebelowme($im, $x, $y) {
      return isBlack($im, $x, $y+2);
  }
  function leftme($im, $x, $y) {
      return isBlack($im, $x-1, $y);
  }
  function rightme($im, $x, $y) {
      return isBlack($im, $x+1, $y);
  }
  function isBlack($im, $x, $y) {
      //pixels outside the image count as not black
      if ($x < 0 || $y < 0 || $x >= imagesx($im) || $y >= imagesy($im)) return false;
      return (16777215 - imagecolorat($im, $x, $y)) > 2211142;
  }

This code uses PHP’s GD library. It scans the image column by column, and when it finds a column with fewer than 2 black pixels it treats that column as the gap between two digits. There it inserts 50 pixels of extra space, and along the way it makes the digits bolder.
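The column scan can be tried in isolation. What follows is a minimal sketch, not the original code: it works on a plain array of rows (1 = black, 0 = white) instead of a GD image, it assumes at least one row, and findGapColumns and its $minBlack parameter are hypothetical names. It leaves out the original’s limit on the number of gaps and its minimum distance between gaps.

```php
<?php
// Hypothetical helper, for illustration only.
// Returns the x positions of all columns that hold fewer than $minBlack
// black pixels, ignoring everything before the first column that has
// any black pixel (this mirrors the $start flag in the script).
function findGapColumns(array $rows, int $minBlack = 2): array {
    $gaps = [];
    $started = false;
    $width = count($rows[0]);
    for ($x = 0; $x < $width; $x++) {
        // count the black pixels in this column
        $black = 0;
        foreach ($rows as $row) {
            $black += $row[$x];
        }
        if ($black < $minBlack && $started) {
            $gaps[] = $x;
        }
        if ($black > 0) {
            $started = true;
        }
    }
    return $gaps;
}

// Two "digits", each 2 pixels wide, separated by one empty column.
$rows = [
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
];
print_r(findGapColumns($rows));
```

For this sample only column 3 is reported: column 0 is empty too, but it comes before the first digit and is skipped.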

bold image

The final step is to fetch the result from gocr.

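Of course, you first need to train gocr. You can do this with:

gocr -p data/ -m 256 -m 130 -a 25 new.jpg

Mode 130 tells gocr to extend its character database in data/, prompting you for every character it doesn’t know yet, and mode 256 switches off the built-in recognition engine so that only the database is used. Once you feel gocr has learned enough, you can request the captcha result in database-only mode (mode 2):

  exec("/usr/bin/gocr -p data/ -m 256 -m 2 new.jpg", $a);
  print_r($a); //print the captcha answer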

$a will contain the answer as an array of gocr’s output lines. Simply post that to the sign-up page with fsockopen and you’re in! Massive accounts in no time.

