In previous posts, we developed an application which uses IBM Tone Analyzer to analyze given text and return tone scoring per sentence, then we improved it into an app which evaluates the content of a web page.

In this post we will focus on security and logging. Both are often considered irrelevant while building a proof-of-concept, only to return like a boomerang when the application has to become production-ready. The framework we used for developing the web app provides some general security options that are easy to implement. Addressing them from the beginning helps us prevent security issues later on.

Always make secure connections

Most web services allow connecting via an insecure (http://) or a secure (https://) channel. Modern web browsers prefer or even mandate the latter. Just to make sure, we can add a little bit of code that redirects any user arriving over the insecure channel to the secure one. We extend the framework to redirect to the same URL, replacing ‘http’ with ‘https’. We do this by creating an in-line middleware routine in our main application:

our $app = builder {
   # serve images straight from disk
   enable 'Plack::Middleware::Static', path => qr{\.(?:jpg|png)$}, root => '.';

   enable sub {
      my $app = shift;
      sub {
         my $env = shift;
         # parse the request once; the handlers reuse it via $env->{req}
         $env->{req} = Plack::Request->new($env);

         # redirect to https !
         # (the header is absent when running locally, hence the default)
         if(($env->{HTTP_X_FORWARDED_PROTO} // '') eq 'http') {
            my $uri = $env->{req}->uri;
            $uri->scheme('https');
            return [ 301, ['Location' => $uri->as_string ], []];
         }

         my $rc =  $app->($env);
         return $rc;
      }
   };

   mount "/urltone" => \&urltone;
   mount "/tone" => \&tone;
   mount "/" => \&main;
};

Our web application itself does not run on an encrypted layer. The Cloud Foundry environment receives all requests, whether they arrive over a secure connection or not, and hands them over to our local app over a standard unencrypted channel. To inform the application whether the request came in over a secure channel, the front-facing infrastructure adds an X-Forwarded-Proto header, which shows up in our $env as HTTP_X_FORWARDED_PROTO. If it contains ‘http’, we rewrite the scheme of the request to ‘https’ and return a redirect of type 301 “Moved Permanently”. That way any request is turned into its secure equivalent, and the end-user is redirected instantly to the same resource over an encrypted channel.
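
To check the redirect decision without deploying, the same logic can be sketched as a plain function that is fed a hand-built $env. This is only an illustration, not the code used in the app: the helper name https_redirect is hypothetical, and it builds the target URL from the standard HTTP_HOST and REQUEST_URI keys instead of using Plack::Request.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a PSGI redirect response for plain-http
# requests, or nothing (undef) when the request may proceed as-is.
sub https_redirect {
    my ($env) = @_;
    my $proto = $env->{HTTP_X_FORWARDED_PROTO} // '';
    return unless lc($proto) eq 'http';

    # Assumption: host and raw request line are available in the PSGI env.
    my $host = $env->{HTTP_HOST}   // 'localhost';
    my $path = $env->{REQUEST_URI} // '/';
    return [ 301, [ 'Location' => "https://$host$path" ], [] ];
}

# A request that arrived over plain http is redirected ...
my $plain = https_redirect({
    HTTP_X_FORWARDED_PROTO => 'http',
    HTTP_HOST              => 'example.org',
    REQUEST_URI            => '/tone?text=hello',
});
print "$plain->[0] $plain->[1][1]\n";

# ... while a request that was already secure passes through.
my $secure = https_redirect({
    HTTP_X_FORWARDED_PROTO => 'https',
    HTTP_HOST              => 'example.org',
    REQUEST_URI            => '/tone',
});
print defined $secure ? "redirect\n" : "no redirect\n";
```

Note the default of an empty string: when the app runs locally there is no router in front of it and the header is simply missing.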

As an additional bonus, since this routine already builds the Plack::Request object and stores it in “$env->{req}”, we can remove the corresponding lines from our subroutines. There is no need to parse the request twice, as the result is passed to the subroutines via our $env variable.

If the request comes in over a secure channel, the remainder of the processing is executed and its result is returned: “my $rc = $app->($env); return $rc;”

Add additional headers

To further harden your application, additional response headers can be set. In particular, if the application uses cookies, as most production apps do, it is good practice to send them only over secure channels. We suggest adding at least the following headers for security reasons:

my $rc =  $app->($env);

# $rc is [ status, [ headers ], [ body ] ]; append to the header list
push(@{$rc->[1]}, 'Strict-Transport-Security' => 'max-age=18000000; includeSubDomains');
push(@{$rc->[1]}, 'X-Frame-Options' => 'SAMEORIGIN');

return $rc;

Adding Strict-Transport-Security (HSTS) tells the web browser to use only encrypted connections to this host for the duration given in max-age. The X-Frame-Options header protects against “clickjacking” by controlling whether other sites may embed our pages in a frame; it is a more robust alternative to script-based “framebusting” code. We will stop here, noting that there are many more specific options that are out of scope for this blog article.
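
As a minimal sketch of how this could be packaged, the helper below adds the two headers only when they are not already present and appends the Secure attribute to any cookie the app sets. The function name add_security_headers is hypothetical, and the sketch assumes the response is a plain PSGI array reference, not a streaming response.

```perl
use strict;
use warnings;

# Hypothetical helper: add default security headers to a PSGI response
# ([ status, \@headers, \@body ]) without duplicating existing headers.
sub add_security_headers {
    my ($res) = @_;
    my $headers = $res->[1];
    my %defaults = (
        'Strict-Transport-Security' => 'max-age=18000000; includeSubDomains',
        'X-Frame-Options'           => 'SAMEORIGIN',
    );

    # Header names are case-insensitive, so compare them in lower case.
    my %seen;
    for (my $i = 0; $i < @$headers; $i += 2) {
        my $name = lc($headers->[$i]);
        $seen{$name} = 1;
        # Mark cookies as Secure so browsers only send them over https.
        if ($name eq 'set-cookie' && $headers->[$i + 1] !~ /;\s*secure\b/i) {
            $headers->[$i + 1] .= '; Secure';
        }
    }
    for my $name (sort keys %defaults) {
        push @$headers, $name => $defaults{$name} unless $seen{ lc($name) };
    }
    return $res;
}

my $res = add_security_headers(
    [ 200, [ 'Content-Type' => 'text/html', 'Set-Cookie' => 'sid=abc; Path=/' ], ['ok'] ]
);
my @pairs = @{ $res->[1] };
while (my ($name, $value) = splice(@pairs, 0, 2)) {
    print "$name: $value\n";
}
```

In the middleware this would replace the two push lines with a single call on $rc before it is returned.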

Multi-threading

On a different level, the default “simple” server that comes with the Plack framework handles one request at a time. That implies that although a queue is used to accept all requests, they are handled sequentially. If a particular request requires more than a few milliseconds, the next request has to wait. This can partially be solved by increasing the number of instances of the web application, but this obviously comes at a cost. To improve the responsiveness, we will set up every Cloud Foundry instance to handle several requests in parallel as well.

To keep all of this simple and efficient we pick “Starlet – a simple, high-performance PSGI/Plack HTTP server”, which serves requests from a pool of pre-forked worker processes. SourceyBuild.sh now looks like this:

# if you want to see what happens in more detail
SOURCEY_VERBOSE=1
 
# if you want to force sourcey to rebuild everything
SOURCEY_REBUILD=0
 
# create a copy of perl
#buildPerl 5.22.1
 
buildPerlModule PSGI
buildPerlModule Plack
buildPerlModule Plack::Runner
buildPerlModule Plack::Request
buildPerlModule Plack::Response
buildPerlModule Plack::Builder
buildPerlModule Plack::Middleware::Static
buildPerlModule Data::Dumper
buildPerlModule HTML::Template
buildPerlModule JSON
buildPerlModule Starlet
buildPerlModule LWP::UserAgent
buildPerlModule LWP::Protocol::https
buildPerlModule HTML::Extract

In SourceyStart.sh we start the Plack framework with our selected server framework:

#!/bin/sh
 
echo "starting plackup"
export PLACK_ENV=deployment
PERL5LIB="/home/vcap/app/sourcey/lib/perl:/home/vcap/app/sourcey/lib/perl5" ./app.pl -s Starlet --port $PORT

One of the reasons Starlet is so popular is the minimal setup and tuning required. That said, for specific details and additional options we refer you to the Starlet manual page.

So far, we have focused on network and connection security and increased the overall stability and resilience of the application well beyond the scope of a simple proof-of-concept.

Application security

Let’s focus now on the application itself. Whenever someone can enter text that is displayed as is, there is a potential risk of what is known as HTML injection.

If the entered text contains HTML code, the web browser will interpret it and format the page accordingly. Malicious parties can abuse this to trick people into doing things they did not intend, simply because they are unaware of the risk.

A solution to this is to always sanitize the input. For that, we prefer to rely on our framework. The current routine creates a blob of HTML code that we insert in the template using escape="none" to take the content as is. Converting the routine to produce a list of sentences without any HTML, and letting the framework escape the content, fixes the issue.
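
To illustrate what escaping does to a hostile input, here is a minimal sketch in plain Perl. The function escape_html and the sample input are hypothetical and only serve as illustration; in the app itself the escaping is left to the template engine.

```perl
use strict;
use warnings;

# The five characters that have a special meaning in HTML text and attributes.
my %entity = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Hypothetical helper: replace every special character by its entity.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Markup in the input becomes harmless text that the browser shows literally.
my $input = q{<b onclick="alert('gotcha')">Click me</b>};
print escape_html($input), "\n";
```

Because the ampersand is part of the same single pass, already-produced entities are not escaped a second time.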

The new beautify2 routine will create lists with the appropriate content and the template will visualize the data in the appropriate HTML format.

sub beautify2 {
   my ($text, $tone, $env) = @_;
   $env->{param}->{overall} = $tone->{document_tone}->{tones};
   my @lines = ();
   for my $sentence (@{$tone->{sentences_tone}}) {
      # collect the CSS classes and the tooltip text for this sentence;
      # %tone (defined elsewhere in the app) maps a tone id to a CSS class
      my ($class, $legend) = ('', '');
      for my $t (@{$sentence->{tones}}) {
          $class .= " $t->{tone_id} $tone{$t->{tone_id}}";
          $legend .= " $t->{tone_name}: $t->{score}\n";
      }
      # cut the sentence, and any text in front of it, off $text;
      # \Q...\E keeps punctuation in the sentence from acting as regex syntax
      next unless $text =~ s/^\s*(.*?)\s*(\Q$sentence->{text}\E)//m;
      my ($before, $match) = ($1, $2);
      push(@lines, { text => $before, class => '', legend => '' }) if($before);
      push(@lines, { text => $match, class => $class, legend => $legend });
   }
   # whatever is left could not be matched to a sentence
   push(@lines, { text => $text, class => '', legend => '' }) if($text);
   $env->{param}->{lines} = \@lines;
}

sub tone {
   my $env = shift;
   my $text = $env->{req}->param('text');
   # keep the submitted text for the template; skip the API call if it is empty
   if($env->{param}->{text} = $text) {
      my $result = $ua->post($api_url, 'Content-Type' => 'application/json', Content => encode_json({text => $text}));
      if($result->is_success) {
         # the parentheses make sure $env is passed to beautify2, not to decode_json
         beautify2($text, decode_json($result->decoded_content), $env);
      } else {
         Page::error($env, $result->status_line);
      }
   }
   return Page::content($env);
}

Obviously, we are separating the content from the formatting. So, we also have to replace the tone.tmpl file:

<TMPL_INCLUDE name="header.tmpl">
<script>$(function () { $('[data-toggle="tooltip"]').tooltip() })</script>

<!-- Page Content -->
<div class="container">
   <header class="jumbotron my-4">
      <img src="style/nxp2.png" class="float-right w-25">
      <h1 class="display-5">Tone assistant</h1>
   </header>

<TMPL_IF name="alert">
<TMPL_LOOP name="alert">
   <div class="container alert alert-<TMPL_VAR NAME='type'>" role="alert"><pre><TMPL_VAR name="msg"></pre></div>
</TMPL_LOOP>
</TMPL_IF>

<div class="my-2">
   <div class="border border-primary rounded p-3">
      <TMPL_IF name="overall">
         <table class="table border shadow bg-white">
            <thead><tr><th scope="col">Overall tone</th><th scope="col">Score</th></tr></thead>
            <tbody>
               <TMPL_LOOP name="overall">
                  <tr><td><TMPL_VAR name="tone_name"><!-- <TMPL_VAR name="tone_id"> --></td><td><TMPL_VAR name="score"></td></tr>
               </TMPL_LOOP>
            </tbody>
         </table>
      </TMPL_IF>
      <TMPL_LOOP name="lines">
         <div class="<TMPL_VAR name='class'>" data-toggle="tooltip" title="<TMPL_VAR name='legend'>"><TMPL_VAR name='text'></div>
      </TMPL_LOOP>
      </div>
   </div>

   <div class="my-2">
      <form>
         <div class="form-group">
            <label for="textinput">Provide your text</label>
            <textarea class="form-control" id="textinput" rows="5" name="text"><TMPL_VAR name="text"></textarea>
         </div>
         <input class="btn btn-primary" type="submit" value="Analyze tone">
      </form>
    </div>
</div>
<!-- /.container -->
<TMPL_INCLUDE name="footer.tmpl">

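The ‘escape’ attribute has disappeared from all our “<TMPL_VAR>” entries. The template engine can now escape every value it inserts (with HTML::Template this requires escaping to be switched on, for example through its default_escape option), so any HTML in the input is displayed literally, exactly the way it was entered, instead of being interpreted by the browser.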

Now that we have a secure app, secure network, and secure connections, let’s add logging.

Logging and debugging

To be able to debug your code it is nice to have the Data::Dumper library available. Using this library will allow you to visualize the full content structure of a variable anytime.
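
A minimal sketch of how such a dump could be wrapped in a small helper is shown below. The name debug_dump is hypothetical; the two Data::Dumper settings it uses, Sortkeys and Indent, are standard configuration variables of the module.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: write a labelled dump of any data structure to STDERR.
sub debug_dump {
    my ($label, $data) = @_;
    local $Data::Dumper::Sortkeys = 1;   # stable key order makes dumps comparable
    local $Data::Dumper::Indent   = 1;   # fixed, compact indentation
    my $dump = Data::Dumper->Dump([$data], [$label]);
    print STDERR $dump;
    return $dump;
}

# Dump a structure similar to the 'param' part of $env.
debug_dump('param', {
    text  => 'Hello world.',
    lines => [ { text => 'Hello world.', class => '', legend => '' } ],
});
```

Writing to STDERR rather than STDOUT keeps the debug output out of the page that is returned to the browser.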

Just for reference, if you dump the $env variable just before returning the content of a page, it looks something like this:

$VAR1 = {
   'SERVER_PROTOCOL' => 'HTTP/1.1',
   'PATH_INFO' => '',
   'HTTP_ACCEPT_LANGUAGE' => 'en-US,en;q=0.5',
   'HTTP_UPGRADE_INSECURE_REQUESTS' => '1',
   'start' => 1602856388,
   'HTTP_DNT' => '1',
   'psgi.input' => \*{'HTTP::Server::PSGI::$input'},
   'SCRIPT_NAME' => '/tone',
   'plack.request.body_parameters' => [],
   'REMOTE_PORT' => 35996,
   'psgi.streaming' => 1,
   'psgi.url_scheme' => 'http',
   'req' => bless( {
      'env' => $VAR1
   }, 'Plack::Request' ),
   'HTTP_USER_AGENT' => 'Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0',
   'HTTP_ACCEPT' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
   'HTTP_ACCEPT_ENCODING' => 'gzip, deflate',
   'param' => {
      'text' => 'For over 25 years Nexperteam proves security solutions in IT infrastructure, cloud technology and DNS, to customers worldwide, for companies of various sizes and from different business sectors. App migration, infrastructure migration, on-premise to one or more cloud solutions or cloud to cloud migration are at the core of our offering. Our team accelerate your business through cloud technology focussing on cost optimization improving availability, reliability and performance of your data centre, network and security infrastructure. We offer strategic plans to align the information technology infrastructure to your business strategy providing a wide range of services in the cloud. We deliver high-quality end-to-end web development services, help in finding ways to overcome the challenges with secure and scalable web apps and solutions, build web applications and solutions that are perfectly tailored to your business needs.
      ',
     'lines' => [
         {
            'text' => 'For over 25 years Nexperteam proves security solutions in IT infrastructure, cloud technology and DNS, to customers worldwide, for companies of various sizes and from different business sectors.',
            'class' => ' analytical text-info',
            'legend' => ' Analytical: 0.560098
            '
         },
         {
            'text' => 'App migration, infrastructure migration, on-premise to one or more cloud solutions or cloud to cloud migration are at the core of our offering.',
            'legend' => ' Tentative: 0.769251
            ',
            'class' => ' tentative text-primary'
…

You can clearly see the ‘param’ structure used to fill the data fields in the HTML template, next to a multitude of other parameters set up by the framework and the Cloud Foundry environment. Both the standard output channel and the error channel are captured by the Cloud Foundry framework and can be accessed directly. You can either retrieve the last lines of the output using the ‘--recent’ option, or omit that option to follow the live output as requests come in:

$ bx cf logs halloworld --recent
Invoking 'cf logs halloworld --recent'...

Retrieving logs for app halloworld in org Nexperteam / space digitalinvite as jan@emailsdress...

2020-10-16T16:43:58.92+0200 [APP/PROC/WEB/0] OUT starting plackup
2020-10-16T16:44:00.64+0200 [CELL/0] OUT Container became healthy
2020-10-16T16:44:14.62+0200 [RTR/1] OUT halloworld.eu-de.mybluemix.net - [2020-10-16T14:44:14.517315379Z] "GET /tone HTTP/1.1" 200 0 3353 "https://halloworld.eu-de.mybluemix.net/" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0" "10.85.78.51:37187" "149.81.126.108:61224" x_forwarded_for:"141.135.70.51, 10.85.78.51" x_forwarded_proto:"https" vcap_request_id:"1ba56dbb-dd12-4d72-6328-d16144a9d12c" response_time:0.107803 gorouter_time:0.000383 app_id:"f19a2a37-0f7f-4b20-9b03-99e7df63701b" app_index:"0" x_cf_routererror:"-" x_global_transaction_id:"48cd8a595f89b1beb22b5e4f" true_client_ip:"-" x_b3_traceid:"ccc851e6c936f097" x_b3_spanid:"ccc851e6c936f097" x_b3_parentspanid:"-" b3:"ccc851e6c936f097-ccc851e6c936f097"
2020-10-16T16:44:14.62+0200 [RTR/1] OUT
2020-10-16T16:44:21.54+0200 [RTR/3] OUT halloworld.eu-de.mybluemix.net - [2020-10-16T14:44:20.665121505Z] "GET /tone?text=For+over+25+years+Nexperteam+proves+security+solutions+in+IT+infrastructure%2C+cloud+technology+and+DNS%2C+to+customers+worldwide%2C+for+companies+of+various+sizes+and+from+different+business+sectors.+App+migration%2C+infrastructure+migration%2C+on-premise+to+one+or+more+cloud+solutions+or+cloud+to+cloud+migration+are+at+the+core+of+our+offering.+Our+team+accelerate+your+business+through+cloud+technology+focussing+on+cost+optimization+improving+availability%2C+reliability+and+performance+of+your+data+centre%2C+network+and+security+infrastructure.+We+offer+strategic+plans+to+align+the+information+technology+infrastructure+to+your+business+strategy+providing+a+wide+range+of+services+in+the+cloud.+We+deliver+high-quality+end-to-end+web+development+services%2C+help+in+finding+ways+to+overcome+the+challenges+with+secure+and+scalable+web+apps+and+solutions%2C+build+web+applications+and+solutions+that+are+perfectly+tailored+to+your+business+needs.%0D%0A HTTP/1.1" 200 0 3354 "https://halloworld.eu-de.mybluemix.net/tone" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0" "10.85.78.51:52613" "149.81.126.108:61224" x_forwarded_for:"141.135.70.51, 10.85.78.51" x_forwarded_proto:"https" vcap_request_id:"124516f1-cf94-42ca-57b4-6d691b6483a9" response_time:0.882782 gorouter_time:0.000390 app_id:"f19a2a37-0f7f-4b20-9b03-99e7df63701b" app_index:"0" x_cf_routererror:"-" x_global_transaction_id:"48cd8a595f89b1c43aca0457" true_client_ip:"-" x_b3_traceid:"00b80b112e5847bd" x_b3_spanid:"00b80b112e5847bd" x_b3_parentspanid:"-" b3:"00b80b112e5847bd-00b80b112e5847bd"
2020-10-16T16:44:21.54+0200 [RTR/3] OUT
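
Since everything the app writes to its error channel ends up in these logs, a small middleware can add application-level lines of its own. The following is a sketch under assumptions, not part of the original app: the name with_request_log is hypothetical, and it expects the wrapped app to return a plain PSGI array reference.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Hypothetical middleware: wrap a PSGI app and log one line per request.
# $out defaults to STDERR, which Cloud Foundry captures.
sub with_request_log {
    my ($app, $out) = @_;
    $out //= \*STDERR;
    return sub {
        my ($env) = @_;
        my $start  = Time::HiRes::time();
        my $res    = $app->($env);
        my $status = ref($res) eq 'ARRAY' ? $res->[0] : '-';
        printf {$out} "%s %s -> %s (%.3fs)\n",
            $env->{REQUEST_METHOD} // '-',
            $env->{PATH_INFO}      // '-',
            $status,
            Time::HiRes::time() - $start;
        return $res;
    };
}

# Wrap a trivial app and send it one fake request.
my $logged = with_request_log(sub { [ 200, [ 'Content-Type' => 'text/plain' ], ['ok'] ] });
$logged->({ REQUEST_METHOD => 'GET', PATH_INFO => '/tone' });
```

Passing a different file handle as the second argument makes the helper easy to test without touching STDERR.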

Neither of these commands builds up a history. But as Cloud Foundry integrates with many log analysis tools, keeping one is as easy as ordering the additional service and linking it to your application.

Cloud providers offer you a variety of security solutions, but it’s up to you to decide which ones you’ll use. In addition to these solutions, you need to implement security measures of your own to ensure that your web app runs securely in the cloud, even if we’re talking about just a proof-of-concept web app.