
Commit 4b732b03 authored by mattip

merge default into branch

parents 25489886 4c4200aa
......@@ -12,3 +12,7 @@ ab0dd631c22015ed88e583d9fdd4c43eebf0be21 pypy-2.1-beta1-arm
32f35069a16d819b58c1b6efb17c44e3e53397b2 release-2.3.1
10f1b29a2bd21f837090286174a9ca030b8680b2 release-2.5.0
8e24dac0b8e2db30d46d59f2c4daa3d4aaab7861 release-2.5.1
8e24dac0b8e2db30d46d59f2c4daa3d4aaab7861 release-2.5.1
0000000000000000000000000000000000000000 release-2.5.1
0000000000000000000000000000000000000000 release-2.5.1
e3d046c43451403f5969580fc1c41d5df6c4082a release-2.5.1
......@@ -174,6 +174,38 @@ interpreter and other ones might have slightly different needs.
User Guide
==========
How to write multithreaded programs: the 10'000-feet view
---------------------------------------------------------
PyPy-STM offers two ways to write multithreaded programs:
* the traditional way, using the ``thread`` or ``threading`` modules,
described first__.
* using ``TransactionQueue``, described next__, as a way to hide the
low-level notion of threads.
.. __: `Drop-in replacement`_
.. __: `transaction.TransactionQueue`_
The issues with low-level threads are well known (particularly in other
languages that don't have GIL-based interpreters): memory corruption,
deadlocks, livelocks, and so on. There are alternatives to dealing
directly with threads, like OpenMP_. These approaches
typically enforce some structure on your code. ``TransactionQueue``
is in part similar: your program needs to have "some chances" of
parallelization before you can apply it. But I believe that the scope
of applicability is much larger with ``TransactionQueue`` than with
other approaches. It usually works without forcing a complete
reorganization of your existing code, and it works on any Python
program which has got *latent* and *imperfect* parallelism. Ideally,
it only requires that the end programmer identifies where this
parallelism is likely to be found, and communicates it to the system
using a simple API.
.. _OpenMP: http://en.wikipedia.org/wiki/OpenMP
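For example, a program written in the second style can be as small as the
following sketch (``process`` is a made-up function; the exact API of
``TransactionQueue`` is documented in the `transaction.TransactionQueue`_
section below)::

    import transaction

    def process(i):
        # purely local work; each scheduled call runs as one transaction
        total = 0
        for j in range(i):
            total += j
        return total

    tq = transaction.TransactionQueue()
    for i in range(1000):
        tq.add(process, i)     # schedule the call; nothing runs yet
    tq.run()                   # run all scheduled calls, possibly in parallel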
Drop-in replacement
-------------------
......@@ -196,31 +228,6 @@ appear to behave as if they were completely run in this serialization
order.
How to write multithreaded programs: the 10'000-feet view
---------------------------------------------------------
PyPy-STM offers two ways to write multithreaded programs:
* the traditional way, using the ``thread`` or ``threading`` modules.
* using ``TransactionQueue``, described next__, as a way to hide the
low-level notion of threads.
.. __: `transaction.TransactionQueue`_
``TransactionQueue`` hides the hard multithreading-related issues that
we typically encounter when using low-level threads. This is not the
first alternative approach to avoid dealing with low-level threads;
for example, OpenMP_ is one. However, it is one of the first ones
which does not require the code to be organized in a particular
fashion. Instead, it works on any Python program which has got
*latent* and *imperfect* parallelism. Ideally, it only requires that
the end programmer identifies where this parallelism is likely to be
found, and communicates it to the system using a simple API.
.. _OpenMP: http://en.wikipedia.org/wiki/OpenMP
transaction.TransactionQueue
----------------------------
......@@ -256,8 +263,9 @@ interleaved with each other just because they run in parallel. The
behavior did not change because we are using ``TransactionQueue``.
All the calls still *appear* to execute in some serial order.
However, the performance typically does not increase out of the box.
In fact, it is likely to be worse at first. Typically, this is
A typical usage of ``TransactionQueue`` goes like this: at first,
the performance does not increase.
In fact, it is likely to be worse. Typically, this is
indicated by the total CPU usage, which remains low (closer to 1 than
N cores). First note that it is expected that the CPU usage should
not go much higher than 1 in the JIT warm-up phase: you must run a
......@@ -282,9 +290,9 @@ aborted (which caused another 1.25 seconds of lost time by pausing),
because of the reason shown in the two independent single-entry
tracebacks: one thread ran the line ``someobj.stuff = 5``, whereas
another thread concurrently ran the line ``someobj.other = 10`` on the
same object. Two writes to the same object cause a conflict, which
aborts one of the two transactions. In the example above this
occurred 12412 times.
same object. These two writes are done to the same object. This
causes a conflict, which aborts one of the two transactions. In the
example above this occurred 12412 times.
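For concreteness, this particular kind of conflict comes from code of the
following shape (a minimal sketch; ``someobj`` and the attribute names are
the placeholders from the traceback above)::

    # two transactions running in parallel, both mutating the same object
    def transaction_a(someobj):
        someobj.stuff = 5        # write to someobj

    def transaction_b(someobj):
        someobj.other = 10       # concurrent write to the same someobj: conflict

Giving each transaction its own object to mutate, or batching both writes
into a single transaction, avoids this write-write conflict.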
The two other conflict sources are ``STM_CONTENTION_INEVITABLE``,
which means that two transactions both tried to do an external
......@@ -303,7 +311,7 @@ Common causes of conflicts:
each transaction starts with sending data to a log file. You should
refactor this case so that it occurs either near the end of the
transaction (which can then mostly run in non-inevitable mode), or
even delegate it to a separate thread.
delegate it to a separate transaction or even a separate thread.
* Writing to a list or a dictionary conflicts with any read from the
same list or dictionary, even one done with a different key. For
......@@ -322,7 +330,7 @@ Common causes of conflicts:
results is fine, use ``transaction.time()`` or
``transaction.clock()``.
* ``transaction.threadlocalproperty`` can be used as class-level::
* ``transaction.threadlocalproperty`` can be used at class-level::
class Foo(object): # must be a new-style class!
x = transaction.threadlocalproperty()
......@@ -342,11 +350,11 @@ Common causes of conflicts:
threads, each running the transactions one after the other; such
thread-local properties will have the value last stored in them in
the same thread, which may come from a random previous transaction.
``threadlocalproperty`` is still useful to avoid conflicts from
cache-like data structures.
This means that ``threadlocalproperty`` is useful mainly to avoid
conflicts from cache-like data structures.
Note that Python is a complicated language; there are a number of less
common cases that may cause conflict (of any type) where we might not
common cases that may cause conflict (of any kind) where we might not
expect it a priori. In many of these cases it could be fixed; please
report any case that you don't understand. (For example, so far,
creating a weakref to an object requires attaching an auxiliary
......@@ -395,8 +403,8 @@ threads can progress or not are rather complicated; you have to consider
it likely that such a piece of code will eventually block all other
threads anyway.
Note that if you want to experiment with ``atomic``, you may have to add
manually a transaction break just before the atomic block. This is
Note that if you want to experiment with ``atomic``, you may have to
manually add a transaction break just before the atomic block. This is
because the boundaries of the block are not guaranteed to be the
boundaries of the transaction: the latter is at least as big as the
block, but may be bigger. Therefore, if you run a big atomic block, it
......@@ -522,7 +530,7 @@ parallelize almost freely (as long as it's not some artificial example
where, say, all threads try to increase the same global counter and do
nothing else).
However, using if the program requires longer transactions, it comes
However, if the program requires longer transactions, it comes
with less obvious rules. The exact details may vary from version to
version, too, until they are a bit more stabilized. Here is an
overview.
......
......@@ -39,8 +39,6 @@ class Module(MixedModule):
'error': 'space.fromcache(interp_pyexpat.Cache).w_error',
'__version__': 'space.wrap("85819")',
'EXPAT_VERSION': 'interp_pyexpat.get_expat_version(space)',
'version_info': 'interp_pyexpat.get_expat_version_info(space)',
}
submodules = {
......@@ -53,3 +51,9 @@ class Module(MixedModule):
'XML_PARAM_ENTITY_PARSING_ALWAYS']:
interpleveldefs[name] = 'space.wrap(interp_pyexpat.%s)' % (name,)
def startup(self, space):
from pypy.module.pyexpat import interp_pyexpat
w_ver = interp_pyexpat.get_expat_version(space)
space.setattr(self, space.wrap("EXPAT_VERSION"), w_ver)
w_ver = interp_pyexpat.get_expat_version_info(space)
space.setattr(self, space.wrap("version_info"), w_ver)
......@@ -118,8 +118,7 @@ gets copied into the attribute ``concretetype`` of the Variables that have been
given this representation. The RTyper also computes a ``concretetype`` for
Constants, to match the way they are used in the low-level operations (for
example, ``int_add(x, 1)`` requires a ``Constant(1)`` with
``concretetype=Signed``, but an untyped ``add(x, 1)`` works with a
``Constant(1)`` that must actually be a PyObject at run-time).
``concretetype=Signed``).
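A small sketch of what this means on the flow-graph model (the variable name
is made up; ``Constant`` and ``Signed`` are the real classes from
``rpython.flowspace.model`` and ``rpython.rtyper.lltypesystem.lltype``)::

    from rpython.flowspace.model import Constant
    from rpython.rtyper.lltypesystem.lltype import Signed

    c = Constant(1)
    c.concretetype = Signed   # the type the RTyper attaches to the '1' in int_add(x, 1)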
In addition to ``lowleveltype``, each Repr subclass provides a set of methods
called ``rtype_op_xxx()`` which define how each high-level operation ``op_xxx``
......@@ -306,14 +305,14 @@ Pointer Types
~~~~~~~~~~~~~
As in C, pointers provide the indirection needed to make a reference modifiable
or sharable. Pointers can only point to a structure, an array, a function
(see below) or a PyObject (see below). Pointers to primitive types, if needed,
must be done by pointing to a structure with a single field of the required
type. Pointer types are declared by::
or sharable. Pointers can only point to a structure, an array or a function
(see below). Pointers to primitive types, if needed, must be done by pointing
to a structure with a single field of the required type. Pointer types are
declared by::
Ptr(TYPE)
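For example (a minimal sketch; the structure name and fields are made up,
but the ``lltype`` functions are the real ones)::

    from rpython.rtyper.lltypesystem.lltype import GcStruct, Signed, Ptr, malloc, typeOf

    POINT = GcStruct('point', ('x', Signed), ('y', Signed))   # a GC-managed container
    PTRTYPE = Ptr(POINT)                                      # pointer type declaration

    p = malloc(POINT)    # returns a pointer holding a reference to the new structure
    p.x = 3
    p.y = 4
    assert typeOf(p) == PTRTYPE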
At run-time, pointers to GC structures (GcStruct, GcArray and PyObject) hold a
At run-time, pointers to GC structures (GcStruct, GcArray) hold a
reference to what they are pointing to. Pointers to non-GC structures that can
go away when their container is deallocated (Struct, Array) must be handled
with care: the bigger structure of which they are part of could be freed while
......@@ -356,22 +355,6 @@ function in a way that isn't fully specified now, but the following attributes
:graph: the flow graph of the function.
The PyObject Type
~~~~~~~~~~~~~~~~~
This is a special type, for compatibility with CPython: it stands for a
structure compatible with PyObject. This is also a "container" type (thinking
about C, this is ``PyObject``, not ``PyObject*``), so it is usually manipulated
via a Ptr. A typed graph can still contain generic space operations (add,
getitem, etc.) provided they are applied on objects whose low-level type is
``Ptr(PyObject)``. In fact, code generators that support this should consider
that the default type of a variable, if none is specified, is ``Ptr(PyObject)``.
In this way, they can generate the correct code for fully-untyped flow graphs.
The testing implementation allows you to "create" PyObjects by calling
``pyobjectptr(obj)``.
Opaque Types
~~~~~~~~~~~~
......
......@@ -44,7 +44,10 @@ class EmptyProfiler(BaseProfiler):
pass
def get_counter(self, num):
return -1.0
return 0
def get_times(self, num):
return 0.0
class Profiler(BaseProfiler):
initialized = False
......@@ -109,6 +112,9 @@ class Profiler(BaseProfiler):
return self.cpu.tracker.total_freed_bridges
return self.counters[num]
def get_times(self, num):
return self.times[num]
def count_ops(self, opnum, kind=Counters.OPS):
from rpython.jit.metainterp.resoperation import rop
self.counters[kind] += 1
......
......@@ -5,7 +5,7 @@ from rpython.jit.metainterp.test.support import LLJitMixin
from rpython.jit.codewriter.policy import JitPolicy
from rpython.jit.metainterp.resoperation import rop
from rpython.rtyper.annlowlevel import hlstr
from rpython.jit.metainterp.jitprof import Profiler
from rpython.jit.metainterp.jitprof import Profiler, EmptyProfiler
class JitHookInterfaceTests(object):
......@@ -178,6 +178,20 @@ class JitHookInterfaceTests(object):
self.meta_interp(main, [], ProfilerClass=Profiler)
def test_get_stats_empty(self):
driver = JitDriver(greens = [], reds = ['i'])
def loop(i):
while i > 0:
driver.jit_merge_point(i=i)
i -= 1
def main():
loop(30)
assert jit_hooks.stats_get_counter_value(None,
Counters.TOTAL_COMPILED_LOOPS) == 0
assert jit_hooks.stats_get_times_value(None, Counters.TRACING) == 0
self.meta_interp(main, [], ProfilerClass=EmptyProfiler)
class LLJitHookInterfaceTests(JitHookInterfaceTests):
# use this for any backend, instead of the super class
......
......@@ -52,6 +52,9 @@ Environment variables can be used to fine-tune the following parameters:
# XXX total addressable size. Maybe by keeping some minimarkpage arenas
# XXX pre-reserved, enough for a few nursery collections? What about
# XXX raw-malloced memory?
# XXX try merging old_objects_pointing_to_pinned into
# XXX old_objects_pointing_to_young (IRC 2014-10-22, fijal and gregor_w)
import sys
from rpython.rtyper.lltypesystem import lltype, llmemory, llarena, llgroup
from rpython.rtyper.lltypesystem.lloperation import llop
......@@ -63,6 +66,7 @@ from rpython.rlib.rarithmetic import ovfcheck, LONG_BIT, intmask, r_uint
from rpython.rlib.rarithmetic import LONG_BIT_SHIFT
from rpython.rlib.debug import ll_assert, debug_print, debug_start, debug_stop
from rpython.rlib.objectmodel import specialize
from rpython.memory.gc.minimarkpage import out_of_memory
#
# Handles the objects in 2 generations:
......@@ -471,10 +475,10 @@ class IncrementalMiniMarkGC(MovingGCBase):
# the start of the nursery: we actually allocate a bit more for
# the nursery than really needed, to simplify pointer arithmetic
# in malloc_fixedsize(). The few extra pages are never used
# anyway so it doesn't even counct.
# anyway so it doesn't even count.
nursery = llarena.arena_malloc(self._nursery_memory_size(), 0)
if not nursery:
raise MemoryError("cannot allocate nursery")
out_of_memory("cannot allocate nursery")
return nursery
def allocate_nursery(self):
......@@ -685,23 +689,48 @@ class IncrementalMiniMarkGC(MovingGCBase):
def collect_and_reserve(self, totalsize):
"""To call when nursery_free overflows nursery_top.
First check if the nursery_top is the real top, otherwise we
can just move the top of one cleanup and continue
Do a minor collection, and possibly also a major collection,
and finally reserve 'totalsize' bytes at the start of the
now-empty nursery.
First check if pinned objects are in front of nursery_top. If so,
jump over the pinned object and try again to reserve totalsize.
Otherwise do a minor collection, and possibly a major collection, and
finally reserve totalsize bytes.
"""
minor_collection_count = 0
while True:
self.nursery_free = llmemory.NULL # debug: don't use me
# note: no "raise MemoryError" between here and the next time
# we initialize nursery_free!
if self.nursery_barriers.non_empty():
# Pinned object in front of nursery_top. Try reserving totalsize
# by jumping into the next, yet unused, area inside the
# nursery. "Next area" in this case is the space between the
# pinned object in front of nursery_top and the pinned object
# after that. Graphically explained:
#
# |- allocating totalsize failed in this area
# | |- nursery_top
# | | |- pinned object in front of nursery_top,
# v v v jump over this
# +---------+--------+--------+--------+-----------+ }
# | used | pinned | empty | pinned | empty | }- nursery
# +---------+--------+--------+--------+-----------+ }
# ^- try reserving totalsize in here next
#
# All pinned objects are represented by entries in
# nursery_barriers (see minor_collection). The last entry is
# always the end of the nursery. Therefore if nursery_barriers
# contains only one element, we jump over a pinned object and
# the "next area" (the space where we will try to allocate
# totalsize) starts at the end of the pinned object and ends at
# nursery's end.
#
# find the size of the pinned object after nursery_top
size_gc_header = self.gcheaderbuilder.size_gc_header
pinned_obj_size = size_gc_header + self.get_size(
self.nursery_top + size_gc_header)
#
# update used nursery space to allocate objects
self.nursery_free = self.nursery_top + pinned_obj_size
self.nursery_top = self.nursery_barriers.popleft()
else:
......@@ -729,6 +758,9 @@ class IncrementalMiniMarkGC(MovingGCBase):
"Seeing minor_collection() at least twice."
"Too many pinned objects?")
#
# We tried above to make room for 'totalsize' (by jumping over a
# pinned object or running a minor collection). Try to reserve
# totalsize now; if this succeeds, break out of the loop.
result = self.nursery_free
if self.nursery_free + totalsize <= self.nursery_top:
self.nursery_free = result + totalsize
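Read together with the diagram above, the retry logic in collect_and_reserve
boils down to the following simplified model (a pure-Python sketch with plain
integers standing in for addresses; it is not the real GC code):

    from collections import deque

    def toy_reserve(state, totalsize, pinned_size_at):
        # state = {'free': int, 'top': int, 'barriers': deque of ints};
        # pinned_size_at maps an address to the size of the pinned object there
        while True:
            if state['free'] + totalsize <= state['top']:
                result = state['free']
                state['free'] = result + totalsize
                return result
            if state['barriers']:
                # a pinned object sits right at 'top': jump over it and retry
                # in the next area, which extends up to the next barrier
                state['free'] = state['top'] + pinned_size_at[state['top']]
                state['top'] = state['barriers'].popleft()
            else:
                raise MemoryError("toy model: the real code does a minor "
                                  "(and possibly major) collection here")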
......@@ -1491,7 +1523,7 @@ class IncrementalMiniMarkGC(MovingGCBase):
# being moved, not from being collected if it is not reachable anymore.
self.surviving_pinned_objects = self.AddressStack()
# The following counter keeps track of alive and pinned young objects
# inside the nursery. We reset it here and increace it in
# inside the nursery. We reset it here and increase it in
# '_trace_drag_out()'.
any_pinned_object_from_earlier = self.any_pinned_object_kept
self.pinned_objects_in_nursery = 0
......@@ -1625,7 +1657,9 @@ class IncrementalMiniMarkGC(MovingGCBase):
else:
llarena.arena_reset(prev, self.nursery + self.nursery_size - prev, 0)
#
# always add the end of the nursery to the list
nursery_barriers.append(self.nursery + self.nursery_size)
#
self.nursery_barriers = nursery_barriers
self.surviving_pinned_objects.delete()
#
......@@ -1950,7 +1984,7 @@ class IncrementalMiniMarkGC(MovingGCBase):
#
arena = llarena.arena_malloc(raw_malloc_usage(totalsize), False)
if not arena:
raise MemoryError("cannot allocate object")
out_of_memory("out of memory: couldn't allocate a few KB more")
llarena.arena_reserve(arena, totalsize)
#
size_gc_header = self.gcheaderbuilder.size_gc_header
......@@ -2058,7 +2092,7 @@ class IncrementalMiniMarkGC(MovingGCBase):
# XXX A simplifying assumption that should be checked,
# finalizers/weak references are rare and short which means that
# they do not need a seperate state and do not need to be
# they do not need a separate state and do not need to be
# made incremental.
if (not self.objects_to_trace.non_empty() and
not self.more_objects_to_trace.non_empty()):
......@@ -2148,9 +2182,9 @@ class IncrementalMiniMarkGC(MovingGCBase):
# even higher memory consumption. To prevent it, if it's
# the second time we are here, then abort the program.
if self.max_heap_size_already_raised:
llop.debug_fatalerror(lltype.Void,
"Using too much memory, aborting")
out_of_memory("using too much memory, aborting")
self.max_heap_size_already_raised = True
self.gc_state = STATE_SCANNING
raise MemoryError
self.gc_state = STATE_FINALIZING
......
......@@ -2,7 +2,7 @@ import sys
from rpython.rtyper.lltypesystem import lltype, llmemory, llarena, rffi
from rpython.rlib.rarithmetic import LONG_BIT, r_uint
from rpython.rlib.objectmodel import we_are_translated
from rpython.rlib.debug import ll_assert
from rpython.rlib.debug import ll_assert, fatalerror
WORD = LONG_BIT // 8
NULL = llmemory.NULL
......@@ -294,7 +294,7 @@ class ArenaCollection(object):
# be a page-aligned address
arena_base = llarena.arena_malloc(self.arena_size, False)
if not arena_base:
raise MemoryError("couldn't allocate the next arena")
out_of_memory("out of memory: couldn't allocate the next arena")
arena_end = arena_base + self.arena_size
#
# 'firstpage' points to the first unused page
......@@ -593,3 +593,10 @@ def _dummy_size(size):
if isinstance(size, int):
size = llmemory.sizeof(lltype.Char) * size
return size
def out_of_memory(errmsg):
"""Signal a fatal out-of-memory error and abort. For situations where
it is hard to write and test code that would handle a MemoryError
exception gracefully.
"""
fatalerror(errmsg)
......@@ -368,6 +368,13 @@ class AsmStackRootWalker(BaseRootWalker):
if rpy_fastgil != 1:
ll_assert(rpy_fastgil != 0, "walk_stack_from doesn't have the GIL")
initialframedata = rffi.cast(llmemory.Address, rpy_fastgil)
#
# very rare issue: initialframedata.address[0] is uninitialized
# in this case, but "retaddr = callee.frame_address.address[0]"
# reads it. If it happens to be exactly a valid return address
# inside the C code, then bad things occur.
initialframedata.address[0] = llmemory.NULL
#
self.walk_frames(curframe, otherframe, initialframedata)
stackscount += 1
#
......@@ -519,17 +526,15 @@ class AsmStackRootWalker(BaseRootWalker):
from rpython.jit.backend.llsupport.jitframe import STACK_DEPTH_OFS
tid = self.gc.get_possibly_forwarded_type_id(ebp_in_caller)
ll_assert(rffi.cast(lltype.Signed, tid) ==
rffi.cast(lltype.Signed, self.frame_tid),
"found a stack frame that does not belong "
"anywhere I know, bug in asmgcc")
# fish the depth
extra_stack_depth = (ebp_in_caller + STACK_DEPTH_OFS).signed[0]
ll_assert((extra_stack_depth & (rffi.sizeof(lltype.Signed) - 1))
== 0, "asmgcc: misaligned extra_stack_depth")
extra_stack_depth //= rffi.sizeof(lltype.Signed)
self._shape_decompressor.setjitframe(extra_stack_depth)
return
if (rffi.cast(lltype.Signed, tid) ==
rffi.cast(lltype.Signed, self.frame_tid)):
# fish the depth
extra_stack_depth = (ebp_in_caller + STACK_DEPTH_OFS).signed[0]
ll_assert((extra_stack_depth & (rffi.sizeof(lltype.Signed) - 1))
== 0, "asmgcc: misaligned extra_stack_depth")
extra_stack_depth //= rffi.sizeof(lltype.Signed)
self._shape_decompressor.setjitframe(extra_stack_depth)
return
llop.debug_fatalerror(lltype.Void, "cannot find gc roots!")
def getlocation(self, callee, ebp_in_caller, location):
......
......@@ -130,7 +130,7 @@ def stats_get_counter_value(warmrunnerdesc, no):
@register_helper(annmodel.SomeFloat())
def stats_get_times_value(warmrunnerdesc, no):
return warmrunnerdesc.metainterp_sd.profiler.times[no]
return warmrunnerdesc.metainterp_sd.profiler.get_times(no)
LOOP_RUN_CONTAINER = lltype.GcArray(lltype.Struct('elem',
('type', lltype.Char),
......
from rpython.rlib import rgc, jit, types
from rpython.rlib.debug import ll_assert
from rpython.rlib.signature import signature
from rpython.rtyper.error import TyperError
from rpython.rtyper.lltypesystem import rstr
from rpython.rtyper.lltypesystem.lltype import (GcForwardReference, Ptr, GcArray,
GcStruct, Void, Signed, malloc, typeOf, nullptr, typeMethod)
......@@ -57,7 +58,7 @@ class BaseListRepr(AbstractBaseListRepr):
elif variant == ("reversed",):
return ReversedListIteratorRepr(self)
else:
raise NotImplementedError(variant)
raise TyperError("unsupported %r iterator over a list" % (variant,))
def get_itemarray_lowleveltype(self):
ITEM = self.item_repr.lowleveltype
......
from rpython.rtyper.lltypesystem.lltype import Ptr, GcStruct, Signed, malloc, Void
from rpython.rtyper.rrange import AbstractRangeRepr, AbstractRangeIteratorRepr
from rpython.rtyper.error import TyperError
# ____________________________________________________________
#
......@@ -59,7 +60,10 @@ class RangeRepr(AbstractRangeRepr):
self.ll_newrange = ll_newrange
self.ll_newrangest = ll_newrangest
def make_iterator_repr(self):
def make_iterator_repr(self, variant=None):
if variant is not None:
raise TyperError("unsupported %r iterator over a range list" %
(variant,))
return RangeIteratorRepr(self)
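Together with the matching rlist change above, the rtyper now reports a
TyperError when an unsupported iterator variant is requested over a range
list. A hedged sketch of RPython code that can trigger it (assuming range()
is kept as a "range list" by the annotator):

    def f():
        total = 0
        # reversed() asks for the ("reversed",) iterator variant, which
        # RangeRepr does not implement -> TyperError at rtyping time
        for i in reversed(range(10)):
            total += i
        return total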
......