PhantomJS ``plainText`` Property: Getting a Web Page's Plain-Text Content, Usage and Example


Published: 2021-09-08 17:05:12 UTC

PhantomJS plainText Property

The plainText property returns the content of the web page as plain text, without any HTML tags.

Syntax

Its syntax is as follows:

wpage.plainText
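
To make the contrast with the raw markup concrete, the following is a minimal sketch that prints the size of both wpage.content (the full HTML source) and wpage.plainText for the same page. The summarize helper is a hypothetical name introduced here for illustration and is not part of PhantomJS; the URL is the local test page used in this article and is assumed to be reachable.

```javascript
// Hypothetical helper (not part of PhantomJS): one-line summary of a string.
function summarize(label, text) {
    return label + ': ' + text.length + ' characters';
}

// Run the PhantomJS part only when the `phantom` global exists,
// so the helper above can also be loaded in other JavaScript runtimes.
if (typeof phantom !== 'undefined') {
    var wpage = require('webpage').create();
    wpage.open('http://localhost/tasks/a.html', function (status) {
        if (status !== 'success') {
            console.log('Failed to load the page');
            phantom.exit(1);
            return;
        }
        // content holds the full HTML source; plainText holds the text only.
        console.log(summarize('content', wpage.content));
        console.log(summarize('plainText', wpage.plainText));
        phantom.exit();
    });
}
```

Checking status before reading the property avoids printing an empty string when the page fails to load.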

Example

Let us take an example to understand the use of the plainText property.

var wpage = require('webpage').create();
wpage.open('http://localhost/tasks/a.html', function (status) {
   // Print the page content as plain text, with all HTML tags stripped
   console.log(wpage.plainText);
   phantom.exit();
});

a.html

<html>
   <head></head>

   <body name = "a">
      <script type = "text/javascript">
         console.log('welcome to cookie example');
         document.cookie = "username = Roy; expires = Thu, 22 Dec 2017 12:00:00 UTC";

         window.onload = function() {
            console.log("page is loaded");
         }
      </script>

      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
      <h1>This is a test page</h1>
   </body>

</html>

The above program generates the following output:

This is a test page

This is a test page

This is a test page

This is a test page

This is a test page

This is a test page

This is a test page

This is a test page

This is a test page

The plainText property returns only the text content of the page, without any script tags or HTML tags.
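
Because the result is an ordinary string, it can be post-processed like any other text. The sketch below splits the plain text into trimmed, non-empty lines and reports how many there are. The nonEmptyLines helper is a hypothetical name introduced for illustration, not a PhantomJS API, and the URL is again the local test page from the example above.

```javascript
// Hypothetical helper (not part of PhantomJS):
// split plain text into trimmed, non-empty lines.
function nonEmptyLines(text) {
    return text.split(/\r?\n/)
        .map(function (line) { return line.trim(); })
        .filter(function (line) { return line.length > 0; });
}

// Run the PhantomJS part only when the `phantom` global exists.
if (typeof phantom !== 'undefined') {
    var wpage = require('webpage').create();
    wpage.open('http://localhost/tasks/a.html', function (status) {
        if (status === 'success') {
            var lines = nonEmptyLines(wpage.plainText);
            console.log(lines.length + ' non-empty lines');
        } else {
            console.log('Failed to load the page');
        }
        phantom.exit();
    });
}
```

Filtering out blank lines is useful here because headings and other block elements are typically separated by empty lines in the plain-text output, as the sample output above shows.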