Updates from tuxdna

  • tuxdna 8:50 pm on April 27, 2014 Permalink | Reply  

    I have moved to a new blog location http://tuxdna.in/blog/archives/ 


  • tuxdna 8:54 pm on February 3, 2014 Permalink | Reply  

    A simple Scala parser to parse 44GB Wikipedia XML Dump 

    I had to parse a Wikipedia XML Dump ( 44GB XML file uncompressed ). The XML dump is available here, and I have also created a smaller sample file to run this code: sample wiki.xml file.

    Below is the XML event based parser using Scala’s XMLEventReader:

    package xml

    import scala.io.Source
    import scala.xml.pull._
    import scala.collection.mutable.ArrayBuffer
    import java.io.File
    import java.io.FileOutputStream
    import scala.xml.XML

    object wikipedia extends App {
      // args(0): path to the XML dump, args(1): output directory for the page files
      val xmlFile = args(0)
      val outputLocation = new File(args(1))

      val xml = new XMLEventReader(Source.fromFile(xmlFile))

      var insidePage = false
      var buf = ArrayBuffer[String]()
      for (event <- xml) {
        event match {
          case EvElemStart(_, "page", _, _) => {
            insidePage = true
            val tag = "<page>"
            buf += tag
          }
          case EvElemEnd(_, "page") => {
            val tag = "</page>"
            buf += tag
            insidePage = false

            // A complete page has been buffered: write it out and start afresh
            writePage(buf)
            buf.clear
          }
          case e @ EvElemStart(_, tag, _, _) => {
            if (insidePage) {
              buf += ("<" + tag + ">")
            }
          }
          case e @ EvElemEnd(_, tag) => {
            if (insidePage) {
              buf += ("</" + tag + ">")
            }
          }
          case EvText(t) => {
            if (insidePage) {
              buf += (t)
            }
          }
          case _ => // ignore
        }
      }

      // Writes one buffered <page> element to <outputLocation>/<page id>.xml
      def writePage(buf: ArrayBuffer[String]) = {
        val s = buf.mkString
        val x = XML.loadString(s)
        val pageId = (x \ "id")(0).child(0).toString
        val f = new File(outputLocation, pageId + ".xml")
        println("writing to: " + f.getAbsolutePath())
        val out = new FileOutputStream(f)
        out.write(s.getBytes())
        out.close
      }
    }

    Find this code snippet on GitHub

    Let’s see how long it takes to process all the Wikipedia pages in the 44GB XML dump.

    It took roughly 7 hours 30 minutes. That’s not bad:

    $ time sbt "run-main xml.wikipedia enwiki-20140102-pages-articles-multistream.xml wiki-pages/"
    [success] Total time: 26918 s, completed Feb 4, 2014 9:56:38 AM
    real	448m41.888s
    user	82m47.594s
    sys	192m46.238s

    And it generated 14128976 XML files:

    $ ls wiki-pages/ | wc -l
    14128976
    $ du -sh wiki-pages/ 
    80G	wiki-pages/

    As you can see, the 44GB uncompressed XML file was split into separate pages that take up 80GB of storage in total. That is something to work on.
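    One possible contributor to the jump from 44GB to 80GB is per-file allocation overhead: every small page file occupies at least one whole filesystem block. That is my assumption, not something measured in this post. A minimal sketch to check it, assuming GNU du (the `--apparent-size` and `--block-size` options) and a hypothetical helper named size_report:

```shell
# Hypothetical helper: compare the sum of file contents with the disk space
# actually allocated for a directory. Assumes GNU coreutils du.
size_report() {
  dir="$1"
  # Bytes of content, ignoring block rounding
  apparent=$(du -s --apparent-size --block-size=1 "$dir" | cut -f1)
  # Bytes actually allocated on disk
  allocated=$(du -s --block-size=1 "$dir" | cut -f1)
  files=$(find "$dir" -type f | wc -l | tr -d ' ')
  echo "files=$files apparent_bytes=$apparent allocated_bytes=$allocated"
}

# Usage (on the output directory from the run above):
#   size_report wiki-pages/
```

    If the allocated figure is much larger than the apparent one, the overhead comes from the number of files rather than from the page contents.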


    First steps with Scala: XML pull parsing

    Scala finding elements in big (30MB) xml files

  • tuxdna 4:31 am on November 10, 2013 Permalink | Reply  

    JMILUG Meetup – 9th November 2013 



    • Hammad Haleem
    • Saleem Ansari
    • Pankaj Sharma
    • Safiyat Reza
    • Umar Ahmad
    • Vivek Gupta
    • Sawood Alam
    • Vipul Nayyar
    • Amit Shah

    There wasn’t a pre-defined agenda, so the discussion took its own course. We discussed many things:

    • Sawood Alam shared the work he is doing in his research group at Old Dominion University.
    • Vivek Gupta shared the kind of challenging problems he is working on.
    • Vipul Nayyar shared his GSoC experience with the RTEMS project.
    • There was a discussion about patents, patent laws, usage of search engines etc.
    • How to approach further studies and programming competitions, plus some general chit-chat.

    Some members met after a very long time at this meetup, so it was only a get-together. No tasks were assigned in particular.

  • tuxdna 8:25 pm on August 25, 2013 Permalink | Reply  

    Attending ScalaTraits 2013 event in New Delhi 

    First of all, I was surprised that an event specifically targeted towards Scala was happening in India and, very fortunately, in New Delhi itself. I attended the ScalaTraits 2013 event in New Delhi a couple of days back.

    The event was put together by Knoldus, a company specializing exclusively in Scala and related technologies.

    Goodies at ScalaTraits 2013

    The agenda was like this:

    Introductory talk

    This talk was an introduction to the Scala ecosystem as a whole by Vikas Hazrati. He discussed some history, which Scala technologies are currently popular and which companies are using them.

    Kick start to Scala by Sanjeev Kumar

    This was a 3 hour session on some core concepts in Scala, followed by setting up Scala IDE bundle and using Scala Worksheets to try out some cool examples.

    Some of the topics I remember right away are Functional Programming, Equational Reasoning, Functional Language features ( functions are first-class values, it encourages immutability ), every statement has a return value ( and a type ), compound expression has a return type as well, Type inference, Classes and Objects, Class Inheritance, Default constructor, Predef object, Case classes, Functional Objects ( those objects that do not have mutable state ), File processing etc.

    Kick start to Play by Neeklanth Sachdeva

    This was a 3 hour session in which we learnt how Play is an MVC web framework ( quite a lot like Ruby on Rails actually ). We set up a Play development environment and created a sample Play app with database connectivity, some routes, and a basic HTML view. Then we deployed it on Heroku.

    All in all, it was a good learning experience with on-the-spot hands-on exercises. Other than that, the venue was very nice, with great food and a pleasant team at work. Bravo!

  • tuxdna 7:54 pm on August 25, 2013 Permalink | Reply  

    Deleting lots of spam content on a Drupal website 

    Deleting Spam on FUDCON.in website

    After the FUDCon Pune event in 2011, the website was left running as is. A couple of days back, I noticed that a lot of spam had accumulated on it. This content is not displayed on the website unless you know its URL. I located the last known sane activity and began estimating how much spam content I had to delete.

    Here I use Drush and a simple PHP script.

    $ drush sql-cli
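    mysql> select unix_timestamp('2012-02-28 19:57:11 +0530') from dual;
    +---------------------------------------------+
    | unix_timestamp('2012-02-28 19:57:11 +0530') |
    +---------------------------------------------+
    |                                  1330487831 |
    +---------------------------------------------+
    mysql> SELECT count(*) FROM node AS n WHERE n.type = 'session' and  n.created > 1330487831  ;
    +----------+
    | count(*) |
    +----------+
    |    22110 |
    +----------+
    1 row in set (0.22 sec)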

    Twenty-two thousand plus entries! That is far too much content to delete from the Admin UI.
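    The cutoff timestamp can also be cross-checked outside the database. The sketch below assumes GNU date, which honors a numeric offset such as +0530 in the input string; to_epoch is a hypothetical helper name. For the same wall-clock time, the value used in the queries above ( 1330487831 ) is what GNU date returns with a -0800 offset, so it seems the database interpreted the literal in its own session time zone. That is my inference, and for an approximate spam cutoff the difference hardly matters.

```shell
# Hypothetical helper: convert a date string to Unix epoch seconds.
# Assumes GNU coreutils date (the -d option).
to_epoch() {
  date -u -d "$1" +%s
}

to_epoch '2012-02-28 19:57:11 +0530'   # prints 1330439231
to_epoch '2012-02-28 19:57:11 -0800'   # prints 1330487831
```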

    A simple solution was to script it.

    $ cat delete_spam.php
    <?php
      // Run with "drush php-script", which bootstraps Drupal before including this file.
      require_once './includes/bootstrap.inc';
      global $user;
      // Temporarily act as the admin user (uid 1) so that node_delete() is permitted.
      $original_user = $user;
      $user = user_load(1);
      echo $user->uid . " " . $user->mail;
      echo "\n";
      // $aquery= db_query("SELECT nid FROM {node} AS n WHERE n.type = 'session' and n.created > 1330487831");
      $aquery= db_query("SELECT nid FROM {node} AS n WHERE n.type = 'session' and n.nid >= 315");
      while ($row = db_fetch_object($aquery)) {
        $nid = $row->nid;
        $node = node_load(array("nid" => $nid));
        echo "Deleting " . $nid . ": " . $node->title . "\n" ;
        node_delete($nid);
      }
      // Restore the original user.
      $user = $original_user;

    Now we can execute this script:

    $ drush php-script delete_spam.php | tee delete_spam.out
    $ wc -l delete_spam.out
    22110 delete_spam.out

    Now, all 22110 entries were deleted!
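    Since the script also prints a line with the user id and mail, counting only the lines that start with "Deleting" gives a stricter check than wc -l. A minimal sketch; count_deleted is a hypothetical helper name:

```shell
# Hypothetical helper: count only the per-node "Deleting <nid>: <title>" lines
# in the log written by tee, skipping any other output.
count_deleted() {
  grep -c '^Deleting ' "$1"
}

# Usage:
#   count_deleted delete_spam.out
```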

    Next step is to clean the old cache as well:

    $ drush cc
    Enter a number to choose which cache to clear.
     [0]  :  Cancel         
     [1]  :  all            
     [2]  :  drush          
     [3]  :  theme-registry 
     [4]  :  menu           
     [5]  :  css-js         
     [6]  :  block          
     [7]  :  module-list    
     [8]  :  theme-list     
     [9]  :  nodeaccess     
    'all' cache was cleared                   [success]

    That’s how I deleted so much spam content quickly.


  • tuxdna 2:04 pm on August 7, 2013 Permalink | Reply
    Tags: , rdesktop   

    Remote Desktop from a Linux client machine 

    Connecting to Remote Desktop from a Linux machine is easy. Invoke the following command:

    rdesktop -r sound:local -r clipboard:CLIPBOARD -z -g '80%' -a 15 -u user.name -p - -d MYDOMAIN  remote.hostname.com

    The above command does the following:

    • Forwards remote sound to the local machine
    • Enables clipboard sharing
    • Uses compression
    • Sets the remote desktop window to 80% of the local machine’s screen
    • Uses 15-bit color depth on the remote desktop
    • Logs in as user.name, with the password read from STDIN
    • Connects to remote.hostname.com in the domain MYDOMAIN

    That’s it!

    EDIT: Updated the explanation in the order of CLI options to the rdesktop command.

    • chuck 3:09 pm on August 7, 2013 Permalink | Reply

      Thanks for the tip. One little favor to ask of you though. When you explain what the command does, can you put the explanation in the order of the command? For example, in the command to set the color depth option is near the middle of command line sequence but at the end of your explanation list.

    • tuxdna 7:40 pm on August 7, 2013 Permalink | Reply


      Thank you so much for your feedback. I updated the post as per your suggestion. 🙂

    • tuxdna 7:06 am on August 19, 2013 Permalink | Reply

      I also like to fit the rdesktop window into the available space in the screen. For example, if I want to fit to the size of Gnome Terminal, I use xwininfo:

      $ xwininfo

      xwininfo: Please select the window about which you
      would like information by clicking the
      mouse in that window.

      xwininfo: Window id: 0x3800004 "/bin/bash"

      Absolute upper-left X: 0
      Absolute upper-left Y: 47
      Relative upper-left X: 0
      Relative upper-left Y: 22
      Width: 1280
      Height: 936
      Depth: 32
      Visual: 0x67
      Visual Class: TrueColor
      Border width: 0
      Class: InputOutput
      Colormap: 0x3800003 (not installed)
      Bit Gravity State: NorthWestGravity
      Window Gravity State: NorthWestGravity
      Backing Store State: NotUseful
      Save Under State: no
      Map State: IsViewable
      Override Redirect State: no
      Corners: +0+47 -0+47 -0-41 +0-41
      -geometry 157x53+0+25

      and then specify the geometry as

      $ rdesktop -g "1280x936"
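      The Width and Height lines can also be turned into the -g argument automatically. A rough sketch using awk; geometry_from_xwininfo is a hypothetical helper name, and it assumes the xwininfo output format shown above:

```shell
# Hypothetical helper: read xwininfo output on stdin and print WIDTHxHEIGHT.
geometry_from_xwininfo() {
  awk '/^ *Width:/ {w=$2} /^ *Height:/ {h=$2} END {if (w && h) print w "x" h}'
}

# Usage (click the target window when prompted):
#   rdesktop -g "$(xwininfo | geometry_from_xwininfo)" remote.hostname.com
```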

    • tuxdna 9:23 am on August 19, 2013 Permalink | Reply

      The rdesktop clipboard wasn’t working well on Ubuntu. As it turns out, it’s actually a problem with the remote Windows machine. Here is the solution:


      • kill rdpclip.exe ( on remote machine )
      • exit rdesktop
      • rerun rdesktop ( this launches rdpclip.exe automatically )
  • tuxdna 9:46 pm on July 30, 2013 Permalink | Reply  

    Continued: Juniper Networks VPN from Fedora 64bit 

    This post is the continuation of my earlier post about Juniper VPN.

    In the earlier post, I connected to the VPN using a login/password/certificate combination. Now I have also managed to use the ncui tool for the connection, which is based on a cookie value and a certificate. I wasn’t able to connect to this configuration using the method in my previous post.

    $ ./ncui -h vpn.example.com -c DSID="YOUR_DSID_COOKIE" -f vpn.example.com-cert.der

    First, log in to your VPN domain from a web browser. Once you have done that, you need to obtain two things:

    • DSID cookie value. This can be easily obtained from the web browser. You only need to look up the relevant cookie value in the preferences or through “View Cookies”.
    • The certificate file in DER format, which is passed with -f in the command above.

    For more detailed information I have listed the installation steps in a gist. Search for “Alternative method using ncui” at the bottom of this gist.

    I also came across two notable tools for Juniper VPN:

    • jvpn: a nice tool which automates many of the steps.

    Again, I got rid of Windows dependency! 🙂


  • tuxdna 1:50 am on July 21, 2013 Permalink | Reply
    Tags: , openstack   

    Setting up OpenStack on Fedora 19 is a lot of work 

    I wanted to experiment with creating a Fedora 19 compute node on Fedora 19 + OpenStack. However it seems there are a bunch of issues which need to be fixed. The issues and solutions are already recorded by many people.

    I list the highlights:

    • MySQL Server in Fedora 19 is actually MariaDB Server
    • Keystone log file needs to be chowned to keystone:keystone
    • Fedora 19 doesn’t have the kvm.modules file at the expected location
    • I saw at least one error due to SELinux

    I am recording the errors, commands and references in the following gist on github.com: openstack-fedora19.md


    Finally I managed to complete the setup. Now it’s time to launch some VMs!

  • tuxdna 1:25 pm on July 20, 2013 Permalink | Reply

    Emacs fullscreen and Fedora 19 

    I had been using fullscreen.el for a long time, but it no longer seems to work on Fedora 19 / GNOME 3.8.1.

    What to do? Here are the steps I followed for now.

    First install wmctrl:

    $ sudo yum install wmctrl

    Now add the following code to your .emacs configuration file:

    (defun switch-full-screen ()
      "Toggle fullscreen for the active window using wmctrl."
      (interactive)
      (shell-command "wmctrl -r :ACTIVE: -btoggle,fullscreen"))
    (global-set-key [f11] 'switch-full-screen)

    Restart Emacs and press F11. That’s it!

    • jmt 2:45 pm on July 20, 2013 Permalink | Reply

      Emacs doesn’t even go full screen using “emacs -fs” from the shell.

    • jmt 2:50 pm on July 20, 2013 Permalink | Reply

      Seems to be a Gnome 3 and/or GTK3 problem since this works fine on the Mate Desktop.

      Thanks for the workaround.

      • tuxdna 6:36 pm on July 20, 2013 Permalink | Reply

        @jmt: Yes, it doesn’t. I too tried “emacs -fs”. Surely, something to do with GTK3/GNOME3.

        Thanks 🙂

    • Alexander Kahl 8:17 am on July 21, 2013 Permalink | Reply

      Just what I needed, thanks a lot!

    • Dag 12:28 pm on July 28, 2013 Permalink | Reply

      There’s a configurable keyboard shortcut for “Toggle fullscreen mode” in GNOME Settings that works with most applications/windows.

  • tuxdna 10:36 pm on July 19, 2013 Permalink | Reply
    Tags: , ghostscript, ocr, pdf, tesseract   

    Extract Text from multi-page PDF with only Images 

    Sometimes there are only images in a PDF. In such cases you cannot select text to copy and paste, or just for reference.

    To extract text from an image or a PDF containing only images, I used the Tesseract OCR Engine and Ghostscript. I am running Fedora 19 at the moment, however these steps should also apply to older versions of Fedora or to Ubuntu. ( I believe this can be done on Windows as well ). Both Tesseract and Ghostscript are free software.

    First, install both Tesseract and Ghostscript on Fedora:

    $ sudo yum install -y ghostscript tesseract

    Now go to the folder where your PDF is located ( assuming that it is named as story.pdf ):

    $ cd ~/Downloads/

    Next, extract each page from PDF as a PNG. For this I used Ghostscript. Note the resolution ( -r300 ):

    $ ghostscript -dNOPAUSE -dBATCH -sDEVICE=pngalpha -r300 -sOutputFile="page%03d".png story.pdf
    $ ls page*.png

    Once we have a PNG for each page, we can use the OCR software to extract text:

    $ for f in page*.png ; do tesseract "$f" "$f.out"; done
    $ ls page*.out.txt

    So now we have all the text from the images in text files. Tesseract works quite well for OCR. Obviously it can’t read drawings or misprinted characters very well, but it is still quite accurate.
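    If a single text file is more convenient, the per-page outputs can be concatenated. A small sketch; combine_pages and story.txt are hypothetical names. It relies on the zero-padded page numbers ( %03d ) so that the shell glob expands in page order:

```shell
# Hypothetical helper: join the per-page OCR outputs into one file.
# The glob expands in lexical order, which matches page order because
# the page numbers are zero-padded (page001, page002, ...).
combine_pages() {
  out="$1"
  cat page*.out.txt > "$out"
}

# Usage (in the folder containing the page*.out.txt files):
#   combine_pages story.txt
```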

    I hope it is helpful for you.


    • Nana111 3:29 am on December 25, 2013 Permalink | Reply

      HI there
      Thanks for your sharing.It is really helpful for me.I am looking for the method for extracting page from PDF files.I have tried to do that using this PDF program:
      But it can not work in my computer.I don’t know why.
      Thanks for your answers.I want to know that if there is a free trial in your program.Thanks a lot.

    • tuxdna 1:37 pm on January 2, 2014 Permalink | Reply

      @Nana111 All the tools I mentioned in this post are free. You can try them on your own. For this you would specifically need Fedora ( a GNU/Linux distribution ) which you can download and install from here: https://fedoraproject.org/get-fedora

      Once you do that, you can install the software as I explained above. Let me know where you get stuck using it, if at all.
