Code optimisation approach

Discussion in 'Programming & Software Development' started by jars121, Aug 7, 2017.

  1. jars121

    jars121 Member

    Joined:
    Mar 6, 2008
    Messages:
    1,775
    Location:
    Sydney
    Morning all,

    Here's the situation. I've spent the last ~4 months building an application for an embedded device. This has been a mixture of writing the application itself, designing the device hardware, enclosure, connectors, etc., and making a test bench.

    I've spent the last ~3 weeks building a custom Linux OS, as well as configuring and building the Qt, Python, SIP, etc. libraries from source. I'm now at the point where my application runs on my prototype embedded device without an X server, displaying directly on the frame buffer with EGLFS.

    The issue: even with EGLFS, the frame rate/performance is abysmal. I haven't yet done a complete profile of my application, but I'm starting to think that Python just won't be fast enough for this particular application.

    My options:
    1. Try compiling my Python application using nuitka (for example)
    2. Profile the current application, rewrite the intensive components in C++ and bind the two languages together
    3. Rewrite the entire application in native C++
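
For option 2, here is a minimal sketch of what the profiling step could look like with the standard library's cProfile. `update_model` is a hypothetical stand-in for whatever the application does per frame, not code from the actual project.

```python
import cProfile
import io
import pstats


def update_model(n):
    """Hypothetical stand-in for the application's per-frame logic."""
    return sum(i * i for i in range(n))


def profile_call(func, *args, top=10):
    """Run func under cProfile and return (result, text report).

    The report lists the `top` entries sorted by cumulative time,
    which is usually the quickest way to spot the expensive call paths.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(update_model, 200000)
    print(report)
```

If the report shows most of the time inside a handful of functions, those are the candidates for a C++ rewrite; if the time is spread evenly, or sits inside Qt calls, binding in C++ components is less likely to help.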

    I've had a cursory attempt at compiling with Nuitka, which threw a myriad of errors at me (basically none of the linked modules could be found/imported). I'm not opposed to writing components in C++ and binding them to the main Python application, but I'm not sure if this is a viable, non-bandaid solution. If I had my time again I would have written the application in C++ from day one, but my language of choice is Python, with my GUI platform of choice being PySide/PyQt.

    If rewriting the application in C++ is going to solve all my performance issues in one go, I may just bite the bullet and do so.

    Can anyone recommend anything I may have missed?

    PS. The application was originally written in PySide (Qt4-based), which used the standard Linux X server for window management. On an ARM-based embedded device, X consumes considerable resources. As such, I ported my application to PyQt5 (Qt5-based), which supports a Qt configured for EGLFS without the X server. I expected the performance to be significantly better without X, but it seems the same, if not slower! I wonder if hardware acceleration is even being used, as the performance would certainly suggest software rendering...
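
One way to check the hardware acceleration question is sketched below. It assumes the Qt environment variables `QT_QPA_EGLFS_DEBUG` and `QSG_INFO` (the latter only matters for Qt Quick) are used to get extra diagnostics at startup, and that the OpenGL renderer string can be read from somewhere, such as that debug output. The list of software renderer names and the example strings are assumptions to adjust for the actual driver stack.

```python
import os

# Substrings that commonly appear in the GL renderer string when Mesa
# falls back to a CPU rasteriser. This list is an assumption; extend as needed.
SOFTWARE_RENDERERS = ("llvmpipe", "softpipe", "swrast", "software rasterizer")


def eglfs_debug_env(base_env=None):
    """Return a copy of the environment with Qt diagnostics switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["QT_QPA_PLATFORM"] = "eglfs"   # run on EGLFS, no X server
    env["QT_QPA_EGLFS_DEBUG"] = "1"    # print EGL config details at startup
    env["QSG_INFO"] = "1"              # scene graph info (Qt Quick only)
    return env


def looks_like_software_rendering(renderer):
    """Guess from a GL renderer string whether rendering is done on the CPU."""
    name = renderer.lower()
    return any(marker in name for marker in SOFTWARE_RENDERERS)


if __name__ == "__main__":
    # Example renderer strings, for illustration only.
    for renderer in ("llvmpipe (LLVM 4.0, 128 bits)", "VideoCore IV HW"):
        print(renderer, "->", looks_like_software_rendering(renderer))
```

The environment could be passed to the application with something like `subprocess.run([sys.executable, "main.py"], env=eglfs_debug_env())`, where `main.py` is a placeholder for the real entry point.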
     
  2. GumbyNoTalent

    GumbyNoTalent Member

    Joined:
    Jan 8, 2003
    Messages:
    6,081
    Location:
    Briz Vegas
    Find another open project using the same display lib and run that on your hardware to test that your hardware is capable of the performance you want.

    This way you will have a benchmark to compare source trees against each other.
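
To make such a comparison concrete, a small frame-timing harness along these lines could be run against both code bases. This is only a sketch: `render_frame` is a hypothetical callable that draws one frame, and in a real Qt application the timestamps would more likely be collected from the paint or frame-swapped callback.

```python
import statistics
import time


def frame_stats(timestamps):
    """Summarise frame timestamps (in seconds) as mean FPS and worst frame time."""
    if len(timestamps) < 2:
        raise ValueError("need at least two frame timestamps")
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_interval = statistics.mean(intervals)
    return {
        "frames": len(intervals),
        # Guard against a zero interval on very coarse clocks.
        "mean_fps": 1.0 / mean_interval if mean_interval > 0 else float("inf"),
        "worst_frame_ms": max(intervals) * 1000.0,
    }


def time_frames(render_frame, count=100):
    """Call render_frame `count` times and summarise the frame timings."""
    stamps = [time.perf_counter()]
    for _ in range(count):
        render_frame()
        stamps.append(time.perf_counter())
    return frame_stats(stamps)


if __name__ == "__main__":
    # Dummy workload standing in for a real draw call.
    print(time_frames(lambda: sum(range(10000)), count=50))
```

The worst frame time is reported alongside the mean because a GUI can average a decent frame rate and still stutter noticeably.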
     
  3. OP
    jars121

    jars121 Member

    Joined:
    Mar 6, 2008
    Messages:
    1,775
    Location:
    Sydney
    That's a very good idea, thanks :)
     
  4. Foliage

    Foliage Member

    Joined:
    Jan 22, 2002
    Messages:
    32,004
    Location:
    Sleepwithyourdadelaide
    What are your requirements? Can you change the embedded device's hardware?

    There are multiple approaches. One is to just buy more powerful hardware and not touch your code. If you do this you have a working product very quickly, but it costs more.

    Redoing the code in C++ could take you months and may or may not give acceptable performance. How much is your time worth to you?

    You need to weigh these things up.
     
  5. deepspring

    deepspring Member

    Joined:
    Jul 8, 2002
    Messages:
    3,601
    Location:
    Maitland, NSW
    It's a long shot, but you could try either Cython or PyPy.

    Both projects are supposed to offer better performance than ordinary Python.
     
  6. waltermitty

    waltermitty Member

    Joined:
    Feb 19, 2016
    Messages:
    728
    Location:
    BRISBANE
    rewrite in rust
     
  7. Foliage

    Foliage Member

    Joined:
    Jan 22, 2002
    Messages:
    32,004
    Location:
    Sleepwithyourdadelaide
    What is the embedded support of rust like?
     
  8. waltermitty

    waltermitty Member

    Joined:
    Feb 19, 2016
    Messages:
    728
    Location:
    BRISBANE
    Good
     
  9. FOXH0UND

    FOXH0UND Member

    Joined:
    Apr 4, 2011
    Messages:
    3,915
    Location:
    Melbourne
    How did you end up going with this?
     
  10. deepspring

    deepspring Member

    Joined:
    Jul 8, 2002
    Messages:
    3,601
    Location:
    Maitland, NSW
  11. cvidler

    cvidler Member

    Joined:
    Jun 29, 2001
    Messages:
    11,354
    Location:
    Canberra
    What are the hardware specs like?

    I've only played around with Raspberry Pis and Arduinos. Even the Pi 3, with all its grunt, runs Python slowly (though it seems Python is what everyone uses on Pis). Python is IMHO a rubbish language for embedded-style hardware. I currently use a Pi 3 to run a GPS PPS time server. I added a little 128x64 OLED screen for monitoring the time/GPS status, which I update at 10 Hz, and that process alone takes up 12% CPU. cgps and ntpd/ntpq, both running to the same schedule, consume an unnoticeable amount of CPU. I'd fix it, but <15% total CPU usage is fine for me.

    You don't have many CPU cycles to play with, so wasting them running an interpreted language is silly.
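
For anyone wanting to put a number on that sort of overhead, here is a rough sketch that runs a task at a fixed rate and reports the fraction of wall-clock time the process spent on the CPU. The `task` callable is hypothetical (in the example above it would be the OLED update), and the figure covers the whole Python process, not just the task.

```python
import time


def cpu_share(task, rate_hz=10.0, duration_s=2.0):
    """Run task at rate_hz for duration_s; return (ticks, CPU time / wall time)."""
    period = 1.0 / rate_hz
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    ticks = 0
    next_tick = wall_start
    while time.perf_counter() - wall_start < duration_s:
        task()
        ticks += 1
        # Sleep until the next scheduled tick; skip the sleep if running late.
        next_tick += period
        delay = next_tick - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return ticks, cpu / wall


if __name__ == "__main__":
    ticks, share = cpu_share(lambda: sum(range(50000)), rate_hz=10.0)
    print("ticks: %d, CPU share: %.1f%%" % (ticks, share * 100.0))
```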
     
  12. Foliage

    Foliage Member

    Joined:
    Jan 22, 2002
    Messages:
    32,004
    Location:
    Sleepwithyourdadelaide
    Anything serious and micro-related is almost always written in C/C++ for the memory and CPU footprint. The biggest factor is simply power consumption.
     
  13. GumbyNoTalent

    GumbyNoTalent Member

    Joined:
    Jan 8, 2003
    Messages:
    6,081
    Location:
    Briz Vegas
  14. OP
    jars121

    jars121 Member

    Joined:
    Mar 6, 2008
    Messages:
    1,775
    Location:
    Sydney
    Sorry for the delay in responding everyone, I've been very busy tinkering away!

    A quick update. I ended up biting the bullet and have started from scratch in C++. Having only very limited exposure to C++ (I read a lot of Qt documentation when building PyQt/PySide applications), the learning curve has been quite steep, as the approach to a compiled language is entirely different to that of an interpreted one such as Python.

    I've now built about 80% of the functionality I had in Python in the C++ application, and have made considerable logic and GUI improvements along the way as well. It's amazing how efficient one can be when building off a template.

    I'm yet to deploy to my ARM device (prototyping with a RPi2, with the aim of either building a CM3 carrier board/device or going to an iMX-based SoM), but I'm confident I'll have ample performance available now as I've completely split the logic (all in C++) from the GUI (all hardware-accelerated in QML).
     
