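Preface
During testing, logging in sometimes requires entering an image CAPTCHA. How can that CAPTCHA be recognized when the test is automated with Selenium? This article describes how to recognize image CAPTCHAs using Selenium, BufferedImage, and Tesseract.
Environment Setup
jdk: 1.8
tessdata: the download link is at the end of this article
Installing Tesseract
My local machine runs Ubuntu: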
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
Adding the dependency to the project
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>4.5.4</version>
</dependency>
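Implementation
In the figure below, login requires verification with an image CAPTCHA. The recognition flow is as follows: use Selenium to locate the CAPTCHA image element and save a screenshot of that element. Then use BufferedImage to convert the saved image to grayscale and binarize it, and remove the noise points from the result. Finally, use Tesseract to recognize the characters in the CAPTCHA.
Processing the Image
First read the CAPTCHA image with BufferedImage, then adjust its brightness and apply grayscale conversion and binarization. After that, remove the noise points from the processed image.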
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// ostu(), isBlack() and isWhite() are helper methods that must be defined in the same class.
public static void cleanLinesInImage(File sfile, String destDir) throws IOException {
    File destF = new File(destDir);
    if (!destF.exists()) {
        destF.mkdirs();
    }
    BufferedImage bufferedImage = ImageIO.read(sfile);
    int h = bufferedImage.getHeight();
    int w = bufferedImage.getWidth();

    // Grayscale conversion
    int[][] gray = new int[w][h];
    for (int x = 0; x < w; x++) {
        for (int y = 0; y < h; y++) {
            int argb = bufferedImage.getRGB(x, y);
            // Brighten the image (adjusting the brightness greatly improves the recognition rate)
            int r = Math.min(255, (int) (((argb >> 16) & 0xFF) * 1.1 + 30));
            int g = Math.min(255, (int) (((argb >> 8) & 0xFF) * 1.1 + 30));
            int b = Math.min(255, (int) ((argb & 0xFF) * 1.1 + 30));
            // Alternative brightness setting:
            // int r = (int) (((argb >> 16) & 0xFF) * 0.1 + 30);
            // int g = (int) (((argb >> 8) & 0xFF) * 0.1 + 30);
            // int b = (int) (((argb >> 0) & 0xFF) * 0.1 + 30);
            gray[x][y] = (int) Math.pow(
                    Math.pow(r, 2.2) * 0.2973 + Math.pow(g, 2.2) * 0.6274 + Math.pow(b, 2.2) * 0.0753,
                    1 / 2.2);
        }
    }
    ImageIO.write(bufferedImage, "jpg", new File(destDir, sfile.getName()));

    // Binarization: pixels above the threshold become white, all others black
    int threshold = ostu(gray, w, h);
    BufferedImage binaryBufferedImage = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_BINARY);
    for (int x = 0; x < w; x++) {
        for (int y = 0; y < h; y++) {
            gray[x][y] = gray[x][y] > threshold ? 0xFFFFFF : 0x000000;
            binaryBufferedImage.setRGB(x, y, gray[x][y]);
        }
    }
    ImageIO.write(binaryBufferedImage, "jpg", new File(destDir, sfile.getName()));

    // Remove interference lines: a black pixel is cleared when a pair of opposite neighbours is white
    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            boolean flag = false;
            if (isBlack(binaryBufferedImage.getRGB(x, y))) {
                // Left and right neighbours are both white
                if (isWhite(binaryBufferedImage.getRGB(x - 1, y)) && isWhite(binaryBufferedImage.getRGB(x + 1, y))) {
                    flag = true;
                }
                // Upper and lower neighbours are both white
                if (isWhite(binaryBufferedImage.getRGB(x, y + 1)) && isWhite(binaryBufferedImage.getRGB(x, y - 1))) {
                    flag = true;
                }
                // Diagonal neighbours are both white
                if (isWhite(binaryBufferedImage.getRGB(x - 1, y + 1)) && isWhite(binaryBufferedImage.getRGB(x + 1, y - 1))) {
                    flag = true;
                }
                if (isWhite(binaryBufferedImage.getRGB(x + 1, y + 1)) && isWhite(binaryBufferedImage.getRGB(x - 1, y - 1))) {
                    flag = true;
                }
                if (flag) {
                    binaryBufferedImage.setRGB(x, y, -1);
                }
            }
        }
    }

    // Print the pixel matrix (for debugging):
    // for (int y = 0; y < h; y++) {
    //     for (int x = 0; x < w; x++) {
    //         System.out.print(isBlack(binaryBufferedImage.getRGB(x, y)) ? "*" : " ");
    //     }
    //     System.out.println();
    // }
    ImageIO.write(binaryBufferedImage, "jpg", new File(destDir, sfile.getName()));
}
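cleanLinesInImage calls three helpers, ostu, isBlack and isWhite, whose bodies are not given in this article. The following is a minimal sketch of what they might look like, not the original implementation. It assumes that ostu is an Otsu threshold computed over the gray matrix, and that a pixel counts as black when the sum of its RGB channels is at most 300; both the cutoff of 300 and the class name CaptchaImageHelpers are assumptions made for illustration.

```java
import java.awt.Color;

/** Hypothetical helpers for cleanLinesInImage; names and cutoffs are assumptions. */
final class CaptchaImageHelpers {

    private CaptchaImageHelpers() {
    }

    /** Treats a pixel as black when its R+G+B sum is at most 300 (assumed cutoff). */
    static boolean isBlack(int colorInt) {
        Color color = new Color(colorInt);
        return color.getRed() + color.getGreen() + color.getBlue() <= 300;
    }

    /** Treats a pixel as white when its R+G+B sum is above 300 (assumed cutoff). */
    static boolean isWhite(int colorInt) {
        Color color = new Color(colorInt);
        return color.getRed() + color.getGreen() + color.getBlue() > 300;
    }

    /**
     * Otsu's method: picks the threshold t that maximizes the between-class
     * variance of the pixels with value <= t and the pixels with value > t.
     * gray is indexed as gray[x][y] with values expected in 0..255.
     */
    static int ostu(int[][] gray, int w, int h) {
        int[] histogram = new int[256];
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) {
                int value = Math.max(0, Math.min(255, gray[x][y]));
                histogram[value]++;
            }
        }

        long total = (long) w * h;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * histogram[i];
        }

        double sumBackground = 0;
        long weightBackground = 0;
        double bestVariance = -1;
        int threshold = 0;
        for (int t = 0; t < 256; t++) {
            weightBackground += histogram[t];
            if (weightBackground == 0) {
                continue; // no pixels at or below t yet
            }
            long weightForeground = total - weightBackground;
            if (weightForeground == 0) {
                break; // every pixel is already in the background class
            }
            sumBackground += (double) t * histogram[t];
            double meanBackground = sumBackground / weightBackground;
            double meanForeground = (sumAll - sumBackground) / weightForeground;
            double diff = meanBackground - meanForeground;
            double betweenVariance = (double) weightBackground * weightForeground * diff * diff;
            if (betweenVariance > bestVariance) {
                bestVariance = betweenVariance;
                threshold = t;
            }
        }
        return threshold;
    }
}
```

To use the sketch, either copy the three methods into the class that contains cleanLinesInImage or add a static import for them. Because cleanLinesInImage treats values strictly greater than the threshold as white, the returned threshold belongs to the dark class.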
OCR Recognition
Pass the path of the tessdata directory you downloaded to the setDatapath method.
import java.io.File;
import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public static String executeTess4J(String imgUrl) {
    String ocrResult = "";
    try {
        ITesseract instance = new Tesseract();
        instance.setDatapath("your tessdata path");
        instance.setLanguage("eng");
        instance.setOcrEngineMode(0);
        // Restrict recognition to letters and digits
        instance.setTessVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789");
        File imgDir = new File(imgUrl);
        // long startTime = System.currentTimeMillis();
        ocrResult = instance.doOCR(imgDir);
    } catch (TesseractException e) {
        e.printStackTrace();
    }
    return ocrResult;
}
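The string returned by doOCR usually carries whitespace such as a trailing newline, so it may be worth normalizing it before typing it into the login form. Below is a small sketch of that step; the class OcrResultCleaner and its methods are hypothetical names and are not part of tess4j, and the expected CAPTCHA length is something you would have to supply for your own site.

```java
/** Hypothetical post-processing for the text returned by executeTess4J. */
final class OcrResultCleaner {

    private OcrResultCleaner() {
    }

    /** Drops everything except ASCII letters and digits; null becomes an empty string. */
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replaceAll("[^A-Za-z0-9]", "");
    }

    /** Returns true when the cleaned code has exactly the expected number of characters. */
    static boolean hasExpectedLength(String code, int expectedLength) {
        return code != null && code.length() == expectedLength;
    }
}
```

A caller could, for example, run normalize on the result of executeTess4J and refresh the CAPTCHA and try again whenever hasExpectedLength returns false.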
Verification
Write the Selenium script:
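import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public static void main(String[] args) throws IOException {
    System.setProperty("webdriver.chrome.driver", "/home/zhangkexin/chromedriver");
    WebDriver driver = new ChromeDriver();
    driver.manage().window().maximize();
    driver.manage().timeouts().implicitlyWait(20, TimeUnit.SECONDS);
    driver.get("https://xkczb.jtw.beijing.gov.cn/#");

    // Locate the CAPTCHA image and take a screenshot of that element only
    WebElement element = driver.findElement(By.xpath("//*[@id=\"getValidCode\"]/img"));
    File img = element.getScreenshotAs(OutputType.FILE);

    // Clean the image; the result is written to the working directory under the same file name
    String path = System.getProperty("user.dir");
    cleanLinesInImage(img, path);

    // Copy the cleaned image to a fixed location and run OCR on it
    String imgFile = path + "/" + img.getName();
    Path source = Paths.get(imgFile);
    Path dest = Paths.get("/home/zhangkexin/ui-test/autoTest/img.jpg");
    Files.copy(source, dest, StandardCopyOption.REPLACE_EXISTING);
    String code = executeTess4J("/home/zhangkexin/ui-test/autoTest/img.jpg");
    System.out.println(code);
    driver.quit();
}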
Take a look at the CAPTCHA image after processing.
And finally, the result that was actually recognized.
tessdata:
Link: https://pan.baidu.com/s/1uJE9wl1oa2WAsBTsydUlmg?pwd=m576
Extraction code: m576