Preventing XSS Attacks in PHP with Filtering, Validation, and Escaping: An Escaping Example

WBOY
2016-06-12 15:04:27

This article walks through an example of escaping, the third of the filter, validate, and escape measures for preventing XSS attacks in PHP.

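Escaping output in PHP

Whenever output is rendered into a web page or an API response, it must be escaped. Escaping is a safeguard that keeps malicious code from being rendered, which would otherwise enable XSS attacks, and it keeps users of the application from unknowingly executing that code.

We can escape output with the htmlentities function mentioned earlier. Always pass ENT_QUOTES as the second argument so that both single and double quotes are escaped, and specify the appropriate character encoding (usually UTF-8) as the third argument. The following example shows how to escape HTML output before rendering it: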

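$output = '<script>alert("Welcome to Laravel Academy!")</script>';
echo htmlentities($output, ENT_QUOTES, 'UTF-8');
If $output were echoed without escaping, the browser would run the script and show an alert dialog.

After escaping, the HTML that is sent becomes the following, which the browser displays as literal text instead of executing:

&lt;script&gt;alert(&quot;Welcome to Laravel Academy!&quot;)&lt;/script&gt;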

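To illustrate why the ENT_QUOTES flag matters, here is a minimal sketch that compares it with ENT_COMPAT for a value placed inside a single-quoted HTML attribute. The payload string and the variable names are made up for this illustration.

```php
<?php
// Hypothetical attacker-controlled value aimed at a single-quoted attribute.
$name = "' onmouseover='alert(1)";

// ENT_COMPAT converts double quotes only, so the single quotes survive.
$compat = htmlentities($name, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES converts both single and double quotes.
$quotes = htmlentities($name, ENT_QUOTES, 'UTF-8');

// With ENT_COMPAT the payload closes the attribute and adds an event handler.
echo "<input value='{$compat}'>", PHP_EOL;

// With ENT_QUOTES the payload stays inside the attribute value.
echo "<input value='{$quotes}'>", PHP_EOL;
```

The second line of output keeps the whole payload inside value, because every single quote has become &#039;.
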

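Modern PHP supports many template engines that already handle escaping under the hood. Of the popular ones, twig/twig escapes output automatically by default, and smarty/smarty can do the same once its escape_html setting is enabled. Escaping by default is an excellent approach and gives PHP web applications strong protection.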

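How the Blade template engine prevents XSS attacks
Laravel's template engine is Blade. Its usage is covered in the official documentation; here we briefly examine how Laravel escapes output under the hood.

In Laravel we usually return a view like this:

return view('test', ['data' => $data]);
This simple example means that Laravel locates the view file test.blade.php under resources/views, passes the $data variable into it, and returns the rendered result to the user as the response body. Which framework code does this pass through, and what happens if $data contains script code such as JavaScript? Let's take a look.

We start with the view helper function. View::make works too, but for simplicity we usually use view, which is defined in Illuminate\Foundation\helpers.php: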

function view($view = null, $data = [], $mergeData = [])
{
    $factory = app(ViewFactory::class);
    if (func_num_args() === 0) {
        return $factory;
    }

    return $factory->make($view, $data, $mergeData);
}
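The view helper resolves the instance $factory bound to the view factory interface ViewFactory from the container. That binding is registered in the register method of Illuminate\View\ViewServiceProvider, which also registers the engine resolver EngineResolver (covering PhpEngine and the CompilerEngine that loads BladeCompiler) and the view file finder FileViewFinder; in short, every service needed to resolve views is registered there. If arguments were passed, view calls the make method on $factory: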

public function make($view, $data = [], $mergeData = [])
{
    if (isset($this->aliases[$view])) {
        $view = $this->aliases[$view];
    }

    $view = $this->normalizeName($view);

    $path = $this->finder->find($view);

    $data = array_merge($mergeData, $this->parseData($data));

    $this->callCreator($view = new View($this, $this->getEngineFromPath($path), $view, $path, $data));

    return $view;
}
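
The make method picks an engine from the view file's suffix through getEngineFromPath. As a rough, simplified sketch of that kind of suffix lookup (the function name engineForPath and the extension map are assumptions for illustration, not Laravel's actual code):

```php
<?php
// Simplified stand-in for a suffix-based engine lookup.
function engineForPath(string $path, array $extensions): string
{
    foreach ($extensions as $suffix => $engine) {
        $needle = '.' . $suffix;

        // Match on the end of the file name, e.g. "test.blade.php".
        if (substr($path, -strlen($needle)) === $needle) {
            return $engine;
        }
    }

    throw new InvalidArgumentException("Unrecognized extension in file: {$path}");
}

// Order matters: "blade.php" must be tested before the more general "php".
$extensions = ['blade.php' => 'blade', 'php' => 'php'];

echo engineForPath('resources/views/test.blade.php', $extensions), PHP_EOL;
echo engineForPath('resources/views/plain.php', $extensions), PHP_EOL;
```

The first call prints blade and the second prints php, since the more specific suffix is checked first.
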
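The make method lives in Illuminate\View\Factory. It resolves the full path of the view file and merges the data passed in. $this->getEngineFromPath picks the engine from the view file's suffix: a file ending in .blade.php gets CompilerEngine (the Blade engine), and a plain .php file gets PhpEngine. make then instantiates a View (Illuminate\View\View) object with these arguments and returns it. Note that the View class overrides the __toString method: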

public function __toString()
{
    return $this->render();
}
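
As a standalone illustration of the __toString pattern, the sketch below uses a hypothetical Greeting class (not part of Laravel) whose string conversion defers to a render method, the same shape as in the View class.

```php
<?php
// Hypothetical class, used only to show how __toString can defer to render().
class Greeting
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function render(): string
    {
        // Escape the name before it becomes part of the output.
        return 'Hello, ' . htmlentities($this->name, ENT_QUOTES, 'UTF-8');
    }

    public function __toString(): string
    {
        return $this->render();
    }
}

$greeting = new Greeting('<b>World</b>');

// echo converts the object to a string, which calls __toString() and render().
echo $greeting, PHP_EOL;
```
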
So when a $view instance is printed, the render method of the View class is what actually runs. The next step is therefore to see what render does:

public function render(callable $callback = null)
{
    try {
        $contents = $this->renderContents();
        $response = isset($callback) ? call_user_func($callback, $this, $contents) : null;

        // Once we have the contents of the view, we will flush the sections if we are
        // done rendering all views so that there is nothing left hanging over when
        // another view gets rendered in the future by the application developer.
        $this->factory->flushSectionsIfDoneRendering();

        return ! is_null($response) ? $response : $contents;
    } catch (Exception $e) {
        $this->factory->flushSections();

        throw $e;
    } catch (Throwable $e) {
        $this->factory->flushSections();
 
        throw $e;
    }
}
The key call in render is $this->renderContents(), so we continue into the renderContents method of the View class:

protected function renderContents()
{
    // We will keep track of the amount of views being rendered so we can flush
    // the section after the complete rendering operation is done. This will
    // clear out the sections for any separate views that may be rendered.
    $this->factory->incrementRender();
    $this->factory->callComposer($this);

    $contents = $this->getContents();

    // Once we've finished rendering the view, we'll decrement the render count
    // so that each sections get flushed out next time a view is created and
    // no old sections are staying around in the memory of an environment.
    $this->factory->decrementRender();

    return $contents;
}
The call to focus on in renderContents is $this->getContents(). The getContents method reads:

protected function getContents()
{
    return $this->engine->get($this->path, $this->gatherData());
}
As mentioned earlier, $this->engine is a CompilerEngine (Illuminate\View\Engines\CompilerEngine) for Blade views, so we move on to the get method of CompilerEngine:

public function get($path, array $data = [])
{
    $this->lastCompiled[] = $path;
    // If this given view has expired, which means it has simply been edited since
    // it was last compiled, we will re-compile the views so we can evaluate a
    // fresh copy of the view. We'll pass the compiler the path of the view.
    if ($this->compiler->isExpired($path)) {
        $this->compiler->compile($path);
    }

    $compiled = $this->compiler->getCompiledPath($path);

    // Once we have the path to the compiled file, we will evaluate the paths with
    // typical PHP just like any other templates. We also keep a stack of views
    // which have been rendered for right exception messages to be generated.
    $results = $this->evaluatePath($compiled, $data);

    array_pop($this->lastCompiled);

    return $results;
}
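
The expiry check in get can be pictured as a timestamp comparison between the view source and its compiled copy. The following is a minimal sketch under that assumption; the helper names compiledPathFor and viewIsExpired and the sha1-based file naming are hypothetical and are not taken from the framework.

```php
<?php
// Hypothetical helper: map a view path to a cache file name.
function compiledPathFor(string $viewPath, string $cacheDir): string
{
    return $cacheDir . '/' . sha1($viewPath) . '.php';
}

// Hypothetical helper: decide whether the view needs to be (re)compiled.
function viewIsExpired(string $viewPath, string $cacheDir): bool
{
    clearstatcache(); // avoid stale timestamps from PHP's stat cache
    $compiled = compiledPathFor($viewPath, $cacheDir);

    // Never compiled: treat as expired so it gets compiled now.
    if (!is_file($compiled)) {
        return true;
    }

    // Source edited at or after the last compile: recompile.
    return filemtime($viewPath) >= filemtime($compiled);
}

$cacheDir = sys_get_temp_dir();
$view = tempnam($cacheDir, 'view');
file_put_contents($view, 'Hello {{ $name }}');

$beforeCompile = viewIsExpired($view, $cacheDir);

// Pretend to compile, then backdate the source so it is older than the cache.
file_put_contents(compiledPathFor($view, $cacheDir), 'Hello <?php echo e($name); ?>');
touch($view, time() - 60);

$afterCompile = viewIsExpired($view, $cacheDir);

var_dump($beforeCompile, $afterCompile);

// Clean up the temporary files.
unlink(compiledPathFor($view, $cacheDir));
unlink($view);
```

The view counts as expired before a compiled copy exists and as fresh once the compiled copy is newer than the source.
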
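Also as mentioned earlier, the compiler used by CompilerEngine is BladeCompiler, so $this->compiler is the Blade compiler. First look at the line $this->compiler->compile($path); which runs on the first request or when the compiled view has expired. The compile method of BladeCompiler is: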

public function compile($path = null)
{
    if ($path) {
        $this->setPath($path);
    }
    if (! is_null($this->cachePath)) {
        $contents = $this->compileString($this->files->get($this->getPath()));

        $this->files->put($this->getCompiledPath($this->getPath()), $contents);
    }
}
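The compile method first compiles the contents of the view file and then stores the result in the matching file under the compiled-view path (storage\framework\views): compile once, run many times, which improves performance. The part to focus on is $this->compileString, which uses the token_get_all function to split the view source into tokens and calls $this->parseToken on each token that is an array: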

protected function parseToken($token)
{
    list($id, $content) = $token;
    if ($id == T_INLINE_HTML) {
        foreach ($this->compilers as $type) {
            $content = $this->{"compile{$type}"}($content);
        }
    }

    return $content;
}
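
To see what token_get_all hands to a method like parseToken, the sketch below tokenizes a small made-up template and keeps only the T_INLINE_HTML chunks, which are the only parts where Blade tags are rewritten. The template string is an assumption for illustration.

```php
<?php
// A made-up template mixing Blade tags with an ordinary PHP block.
$template = implode("\n", [
    '<h1>{{ $title }}</h1>',
    '<?php echo 1 + 1; ?>',
    '<p>{!! $body !!}</p>',
]);

$htmlChunks = [];

foreach (token_get_all($template) as $token) {
    // Array tokens carry [id, text, line]; single characters arrive as strings.
    if (is_array($token) && $token[0] === T_INLINE_HTML) {
        $htmlChunks[] = $token[1];
    }
}

// Only the text outside the PHP block is reported as inline HTML.
print_r($htmlChunks);
```

The PHP block in the middle is left alone, so existing PHP code in a view is never touched by the Blade tag rewriting.
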
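Now we are very close to the answer. For HTML code (including Blade directives), parseToken calls compileExtensions, compileStatements, compileComments, and compileEchos in turn. We focus on the output method compileEchos. By default the Blade engine provides three echo methods, compileRawEchos, compileEscapedEchos, and compileRegularEchos, whose tags are {!! !!}, {{{ }}}, and {{ }} respectively. As the name suggests, compileRawEchos handles raw output: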

protected function compileRawEchos($value)
{
    $pattern = sprintf('/(@)?%s\s*(.+?)\s*%s(\r?\n)?/s', $this->rawTags[0], $this->rawTags[1]);
    $callback = function ($matches) {
        $whitespace = empty($matches[3]) ? '' : $matches[3].$matches[3];

        return $matches[1] ? substr($matches[0], 1) : '<?php echo '.$this->compileEchoDefaults($matches[2]).'; ?>'.$whitespace;
    };

    return preg_replace_callback($pattern, $callback, $value);
}
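In other words, a variable wrapped in {!! !!} in a Blade view is output as raw, unescaped HTML. This form is meant for cases where you need to emit markup such as images or links, and it should only be given content you trust.

{{{ }}} is handled by compileEscapedEchos. It was the escaping syntax in Laravel 4.2 and earlier and has since been replaced by {{ }}, which is handled by compileRegularEchos: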

protected function compileRegularEchos($value)
{
    $pattern = sprintf('/(@)?%s\s*(.+?)\s*%s(\r?\n)?/s', $this->contentTags[0], $this->contentTags[1]);
    $callback = function ($matches) {
        $whitespace = empty($matches[3]) ? '' : $matches[3].$matches[3];

        $wrapped = sprintf($this->echoFormat, $this->compileEchoDefaults($matches[2]));

        return $matches[1] ? substr($matches[0], 1) : '<?php echo '.$wrapped.'; ?>'.$whitespace;
    };

    return preg_replace_callback($pattern, $callback, $value);
}
Here $this->echoFormat is e(%s). compileEscapedEchos wraps its output in a call to e() as well:

protected function compileEscapedEchos($value)
{
    $pattern = sprintf('/(@)?%s\s*(.+?)\s*%s(\r?\n)?/s', $this->escapedTags[0], $this->escapedTags[1]);
    $callback = function ($matches) {
        $whitespace = empty($matches[3]) ? '' : $matches[3].$matches[3];

        return $matches[1] ? $matches[0] : '<?php echo e('.$this->compileEchoDefaults($matches[2]).'); ?>'.$whitespace;
    };

    return preg_replace_callback($pattern, $callback, $value);
}
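
The echo compilers share one pattern: match a pair of tags with a regular expression and replace it with a PHP echo statement. A reduced sketch of that idea follows. The function name compileEchosSketch is hypothetical, and the sketch deliberately ignores the @ prefix, default values, and whitespace handling that the real methods deal with.

```php
<?php
// Illustrative only: a reduced version of the tag-to-echo rewriting.
function compileEchosSketch(string $template): string
{
    // Raw tags become a plain echo with no escaping.
    $template = preg_replace_callback('/\{!!\s*(.+?)\s*!!\}/s', function ($m) {
        return '<?php echo ' . $m[1] . '; ?>';
    }, $template);

    // Regular tags become an echo wrapped in the e() escaping helper.
    return preg_replace_callback('/\{\{\s*(.+?)\s*\}\}/s', function ($m) {
        return '<?php echo e(' . $m[1] . '); ?>';
    }, $template);
}

$compiledSketch = compileEchosSketch('<p>{{ $name }}</p><div>{!! $html !!}</div>');

echo $compiledSketch, PHP_EOL;
```

The escaped and raw forms differ only in whether the expression is wrapped in e() before being echoed.
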
The helper function e() is defined in Illuminate\Support\helpers.php:

function e($value)
{
    if ($value instanceof Htmlable) {
        return $value->toHtml();
    }
    return htmlentities($value, ENT_QUOTES, 'UTF-8', false);
}
Its job is to escape the value it is given.
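
The fourth argument passed to htmlentities in e() is double_encode, set to false, which leaves entities that are already in the value untouched. A small sketch of the effect, using a hypothetical stand-in named escapeHtml that omits the Htmlable branch:

```php
<?php
// Hypothetical stand-in for e(), without the Htmlable check.
function escapeHtml(string $value): string
{
    // double_encode = false: existing entities such as &amp; are not re-encoded.
    return htmlentities($value, ENT_QUOTES, 'UTF-8', false);
}

$alreadyEncoded = 'Tom &amp; Jerry <b>';

// Entities already present are preserved, new special characters are escaped.
echo escapeHtml($alreadyEncoded), PHP_EOL;

// With the default double_encode = true the ampersand is encoded a second time.
echo htmlentities($alreadyEncoded, ENT_QUOTES, 'UTF-8'), PHP_EOL;
```
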

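After this compilation step, {{ $data }} in the view has been compiled into <?php echo e($data); ?>. To see how $data finally reaches the view and is output, we return to the get method of CompilerEngine and look at this line: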

$results = $this->evaluatePath($compiled, $data);
evaluatePath receives the path of the compiled view file and the data passed to the view. The method is defined as follows:

protected function evaluatePath($__path, $__data)
{
    $obLevel = ob_get_level();

    ob_start();

    extract($__data, EXTR_SKIP);

    // We'll evaluate the contents of the view inside a try/catch block so we can
    // flush out any stray output that might get out before an error occurs or
    // an exception is thrown. This prevents any partial views from leaking.
    try {
        include $__path;
    } catch (Exception $e) {
        $this->handleViewException($e, $obLevel);
    } catch (Throwable $e) {
        $this->handleViewException(new FatalThrowableError($e), $obLevel);
    }

    return ltrim(ob_get_clean());
}
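evaluatePath calls PHP's built-in extract function to import the passed-in variables from the array into the current symbol table. The compiled view is then pulled in with include $__path and runs in that same scope, so each variable referenced in the compiled view resolves, by key name, to the value that was passed in.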
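
The combination of output buffering, extract, and include can be tried in isolation. The sketch below is a simplified version under stated assumptions: the function name renderFile is hypothetical, the exception handling of the original method is left out, and the compiled view is a temporary file that calls htmlentities directly because the e() helper is not available outside Laravel.

```php
<?php
// Hypothetical helper: buffer output, expose the data as variables, include the file.
function renderFile(string $__path, array $__data): string
{
    ob_start();

    // Turn ['data' => ...] into a local $data without overwriting existing variables.
    extract($__data, EXTR_SKIP);

    // The included file runs in this function's scope and sees $data.
    include $__path;

    return ltrim(ob_get_clean());
}

// Stand-in for a compiled view file.
$compiled = tempnam(sys_get_temp_dir(), 'view');
file_put_contents(
    $compiled,
    '<p><?php echo htmlentities($data, ENT_QUOTES, "UTF-8"); ?></p>'
);

$html = renderFile($compiled, ['data' => '<script>alert(1)</script>']);
unlink($compiled);

echo $html, PHP_EOL;
```

The script tag arrives in the output as escaped text inside the paragraph, so a browser would display it instead of running it.
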

That is the basic path a Blade view takes from rendering to output. By echoing with {{ }}, output is escaped, which is how Blade guards against XSS attacks.
