Web   ·   Wiki   ·   Activities   ·   Blog   ·   Lists   ·   Chat   ·   Meeting   ·   Bugs   ·   Git   ·   Translate   ·   Archive   ·   People   ·   Donate
summaryrefslogtreecommitdiffstats
path: root/studio/static/doc/flask-docs/unicode.html
blob: d5e480068bce68626703d7b4b195b44c4c3eab1b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
    <title>Unicode in Flask &mdash; Flask 0.8 documentation</title>
    
    <link rel="stylesheet" href="_static/flasky.css" type="text/css" />
    <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '',
        VERSION:     '0.8',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="_static/jquery.js"></script>
    <script type="text/javascript" src="_static/underscore.js"></script>
    <script type="text/javascript" src="_static/doctools.js"></script>
    <link rel="top" title="Flask 0.8 documentation" href="index.html" />
    <link rel="next" title="Flask Extension Development" href="extensiondev.html" />
    <link rel="prev" title="Security Considerations" href="security.html" />
   
  
  <link rel="apple-touch-icon" href="_static/touch-icon.png" />
  
  <link media="only screen and (max-device-width: 480px)" href="_static/small_flask.css" type= "text/css" rel="stylesheet" />

  </head>
  <body>
    <div class="related">
      <h3>Navigation</h3>
      <ul>
        <li class="right" style="margin-right: 10px">
          <a href="genindex.html" title="General Index"
             accesskey="I">index</a></li>
        <li class="right" >
          <a href="extensiondev.html" title="Flask Extension Development"
             accesskey="N">next</a> |</li>
        <li class="right" >
          <a href="security.html" title="Security Considerations"
             accesskey="P">previous</a> |</li>
        <li><a href="index.html">Flask 0.8 documentation</a> &raquo;</li> 
      </ul>
    </div>  

    <div class="document">
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="unicode-in-flask">
<h1>Unicode in Flask<a class="headerlink" href="#unicode-in-flask" title="Permalink to this headline">¶</a></h1>
<p>Flask like Jinja2 and Werkzeug is totally Unicode based when it comes to
text.  Not only these libraries, also the majority of web related Python
libraries that deal with text.  If you don&#8217;t know Unicode so far, you
should probably read <a class="reference external" href="http://www.joelonsoftware.com/articles/Unicode.html">The Absolute Minimum Every Software Developer
Absolutely, Positively Must Know About Unicode and Character Sets</a>.  This part of the
documentation just tries to cover the very basics so that you have a
pleasant experience with Unicode related things.</p>
<div class="section" id="automatic-conversion">
<h2>Automatic Conversion<a class="headerlink" href="#automatic-conversion" title="Permalink to this headline">¶</a></h2>
<p>Flask has a few assumptions about your application (which you can change
of course) that give you basic and painless Unicode support:</p>
<ul class="simple">
<li>the encoding for text on your website is UTF-8</li>
<li>internally you will always use Unicode exclusively for text except
for literal strings with only ASCII character points.</li>
<li>encoding and decoding happens whenever you are talking over a protocol
that requires bytes to be transmitted.</li>
</ul>
<p>So what does this mean to you?</p>
<p>HTTP is based on bytes.  Not only the protocol, also the system used to
address documents on servers (so called URIs or URLs).  However HTML which
is usually transmitted on top of HTTP supports a large variety of
character sets and which ones are used, are transmitted in an HTTP header.
To not make this too complex Flask just assumes that if you are sending
Unicode out you want it to be UTF-8 encoded.  Flask will do the encoding
and setting of the appropriate headers for you.</p>
<p>The same is true if you are talking to databases with the help of
SQLAlchemy or a similar ORM system.  Some databases have a protocol that
already transmits Unicode and if they do not, SQLAlchemy or your other ORM
should take care of that.</p>
</div>
<div class="section" id="the-golden-rule">
<h2>The Golden Rule<a class="headerlink" href="#the-golden-rule" title="Permalink to this headline">¶</a></h2>
<p>So the rule of thumb: if you are not dealing with binary data, work with
Unicode.  What does working with Unicode in Python 2.x mean?</p>
<ul class="simple">
<li>as long as you are using ASCII charpoints only (basically numbers,
some special characters of latin letters without umlauts or anything
fancy) you can use regular string literals (<tt class="docutils literal"><span class="pre">'Hello</span> <span class="pre">World'</span></tt>).</li>
<li>if you need anything else than ASCII in a string you have to mark
this string as Unicode string by prefixing it with a lowercase <cite>u</cite>.
(like <tt class="docutils literal"><span class="pre">u'Hänsel</span> <span class="pre">und</span> <span class="pre">Gretel'</span></tt>)</li>
<li>if you are using non-Unicode characters in your Python files you have
to tell Python which encoding your file uses.  Again, I recommend
UTF-8 for this purpose.  To tell the interpreter your encoding you can
put the <tt class="docutils literal"><span class="pre">#</span> <span class="pre">-*-</span> <span class="pre">coding:</span> <span class="pre">utf-8</span> <span class="pre">-*-</span></tt> into the first or second line of
your Python source file.</li>
<li>Jinja is configured to decode the template files from UTF-8.  So make
sure to tell your editor to save the file as UTF-8 there as well.</li>
</ul>
</div>
<div class="section" id="encoding-and-decoding-yourself">
<h2>Encoding and Decoding Yourself<a class="headerlink" href="#encoding-and-decoding-yourself" title="Permalink to this headline">¶</a></h2>
<p>If you are talking with a filesystem or something that is not really based
on Unicode you will have to ensure that you decode properly when working
with Unicode interface.  So for example if you want to load a file on the
filesystem and embed it into a Jinja2 template you will have to decode it
from the encoding of that file.  Here the old problem that text files do
not specify their encoding comes into play.  So do yourself a favour and
limit yourself to UTF-8 for text files as well.</p>
<p>Anyways.  To load such a file with Unicode you can use the built-in
<tt class="xref py py-meth docutils literal"><span class="pre">str.decode()</span></tt> method:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">def</span> <span class="nf">read_file</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">charset</span><span class="o">=</span><span class="s">&#39;utf-8&#39;</span><span class="p">):</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">&#39;r&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span><span class="o">.</span><span class="n">decode</span><span class="p">(</span><span class="n">charset</span><span class="p">)</span>
</pre></div>
</div>
<p>To go from Unicode into a specific charset such as UTF-8 you can use the
<tt class="xref py py-meth docutils literal"><span class="pre">unicode.encode()</span></tt> method:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">def</span> <span class="nf">write_file</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">contents</span><span class="p">,</span> <span class="n">charset</span><span class="o">=</span><span class="s">&#39;utf-8&#39;</span><span class="p">):</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">&#39;w&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">contents</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="n">charset</span><span class="p">))</span>
</pre></div>
</div>
</div>
<div class="section" id="configuring-editors">
<h2>Configuring Editors<a class="headerlink" href="#configuring-editors" title="Permalink to this headline">¶</a></h2>
<p>Most editors save as UTF-8 by default nowadays but in case your editor is
not configured to do this you have to change it.  Here some common ways to
set your editor to store as UTF-8:</p>
<ul>
<li><p class="first">Vim: put <tt class="docutils literal"><span class="pre">set</span> <span class="pre">enc=utf-8</span></tt> to your <tt class="docutils literal"><span class="pre">.vimrc</span></tt> file.</p>
</li>
<li><p class="first">Emacs: either use an encoding cookie or put this into your <tt class="docutils literal"><span class="pre">.emacs</span></tt>
file:</p>
<div class="highlight-python"><pre>(prefer-coding-system 'utf-8)
(setq default-buffer-file-coding-system 'utf-8)</pre>
</div>
</li>
<li><p class="first">Notepad++:</p>
<ol class="arabic simple">
<li>Go to <em>Settings -&gt; Preferences ...</em></li>
<li>Select the &#8220;New Document/Default Directory&#8221; tab</li>
<li>Select &#8220;UTF-8 without BOM&#8221; as encoding</li>
</ol>
<p>It is also recommended to use the Unix newline format, you can select
it in the same panel but this is not a requirement.</p>
</li>
</ul>
</div>
</div>


          </div>
        </div>
      </div>
      <div class="sphinxsidebar">
        <div class="sphinxsidebarwrapper"><p class="logo"><a href="index.html">
  <img class="logo" src="_static/flask.png" alt="Logo"/>
</a></p>
  <h3><a href="index.html">Table Of Contents</a></h3>
  <ul>
<li><a class="reference internal" href="#">Unicode in Flask</a><ul>
<li><a class="reference internal" href="#automatic-conversion">Automatic Conversion</a></li>
<li><a class="reference internal" href="#the-golden-rule">The Golden Rule</a></li>
<li><a class="reference internal" href="#encoding-and-decoding-yourself">Encoding and Decoding Yourself</a></li>
<li><a class="reference internal" href="#configuring-editors">Configuring Editors</a></li>
</ul>
</li>
</ul>
<h3>Related Topics</h3>
<ul>
  <li><a href="index.html">Documentation overview</a><ul>
      <li>Previous: <a href="security.html" title="previous chapter">Security Considerations</a></li>
      <li>Next: <a href="extensiondev.html" title="next chapter">Flask Extension Development</a></li>
  </ul></li>
</ul>
  <h3>This Page</h3>
  <ul class="this-page-menu">
    <li><a href="_sources/unicode.txt"
           rel="nofollow">Show Source</a></li>
  </ul>
<div id="searchbox" style="display: none">
  <h3>Quick search</h3>
    <form class="search" action="search.html" method="get">
      <input type="text" name="q" />
      <input type="submit" value="Go" />
      <input type="hidden" name="check_keywords" value="yes" />
      <input type="hidden" name="area" value="default" />
    </form>
    <p class="searchtip" style="font-size: 90%">
    Enter search terms or a module, class or function name.
    </p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
        </div>
      </div>
      <div class="clearer"></div>
    </div>
    <div class="footer">
      &copy; Copyright 2010, Armin Ronacher.
      Created using <a href="http://sphinx.pocoo.org/">Sphinx</a>.
    </div>
  </body>
</html>